Business World

What’s behind the hype about artificial intelligence?


APOORV SAXENA, lead product manager at Google and cofounder of the AI Frontiers conference that will be held in Santa Clara, Calif., from November 3-5, speaks with Knowledge@Wharton about why interest in artificial intelligence is growing, what is likely to happen in the near future and which challenges will take longer to overcome. [Knowledge@Wharton is a media partner for the conference.]

An edited transcript of the conversation follows.

Knowledge@Wharton: Interest in artificial intelligence has picked up dramatically in recent times. What is driving this hype? What are some of the biggest prevailing misconceptions about AI, and how would you separate the hype from reality?

Apoorv Saxena:

There are multiple factors driving the strong interest in AI recently. First, there have been significant gains on long-standing problems in AI, mostly problems of image and speech understanding. For example, computers are now able to transcribe human speech better than humans. Speech understanding has been worked on for almost 20 to 30 years, and only recently have we seen significant gains in that area. The same is true of image understanding, and also of specific parts of human language understanding such as translation.

Such progress has been made possible by applying an old technique called deep learning and running it on highly distributed and scalable computing infrastructure. This, combined with the availability of large amounts of data to train these algorithms and easy-to-use tools to build AI models, is the major driver of interest in AI.

It is natural for people to project the recent successes in specific domains into the future. Some are even projecting the present into domains where deep learning has not been very effective, and that creates a lot of misconception and hype. AI is still pretty bad at learning new concepts and extending that learning to new contexts.

For example, AI systems still require a tremendous amount of data to train. Humans do not need to look at 40,000 images of cats to identify a cat. A human child can look at two cats, figure out what a cat is and what a dog is, and learn to distinguish between them. So today’s AI systems are nowhere close to replicating how the human mind learns. That will be a challenge for the foreseeable future.

A lot of the hype originates from extrapolating current trends while ignoring the reality of taking something from a research paper to an engineered product. As a product manager responsible for building products using the latest AI technology, I am constantly trying to separate the hype from reality. The best way to do this is to combine the healthy skepticism of an engineer with the optimism of a researcher. You need to understand the underlying technical principles driving the latest cool AI demo, and extrapolate only the parts of the technology that have firm technical grounding. For example, if you understand the underlying drivers of improvements in, say, speech recognition, it becomes easy to extrapolate the upcoming improvements in speech recognition quality. Combine that with a healthy skepticism about where natural language understanding is today, and you will be able to identify the right opportunities: say, which pieces of a call center’s workflow will be automated in the near future.

As I mentioned, in narrow domains such as speech recognition, AI is now more sophisticated than the best humans, while in more general domains that require reasoning, context understanding and goal seeking, AI can’t even compete with a five-year-old child. AI systems have still not figured out how to do unsupervised learning well, how to train on a very limited amount of data, or how to train without a lot of human intervention. That is going to remain the main difficulty. None of the recent research has shown a lot of progress here.

There is a very good quote from [Google engineering fellow] Geoff Hinton, who is known as the father of deep learning. I might be misquoting him, but it goes something like, “Deep learning actually spoiled AI because it made a lot of people think it can do everything, when we know that it can only solve very limited kinds of problems.” I think there are still significant challenges in AI, and no recent advances tell us that we will solve them anytime soon.

Knowledge@Wharton: AI is a vast field covering many areas, and some of them are quite confusing to non-experts. For example, you and Wharton operations, information and decisions professor Kartik Hosanagar wrote an article for Knowledge@Wharton last April about the democratization of machine learning. What is happening today in machine learning that impresses or surprises you the most?

Saxena:

What impresses me is how widely AI is being used to help the world, now that really easy-to-use tools are available. We have heard about farmers in Japan using AI to sort their cucumbers, separating good produce from bad. A logistics company in Africa is using AI to route packages. It always surprises me how hungry, innovative and creative people are in using AI. Even though it is limited in some ways, people are still using it and making it meaningful. I am definitely super impressed [with this phenomenon].

Knowledge@Wharton: In addition to machine learning, you also referred a couple of times to deep learning. For many of our readers who are not experts in AI, could you explain how deep learning differs from machine learning? What are some of the biggest breakthroughs in deep learning?

Saxena:

Machine learning is much broader than deep learning. Machine learning is essentially a computer learning patterns from data and using the learned patterns to make predictions on new data. Deep learning is a specific machine learning technique.
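To make “learning patterns from data and predicting on new data” concrete, here is a minimal editorial sketch (not from the interview), using an assumed toy dataset and an ordinary least-squares fit:

```python
import numpy as np

# Toy training data: y is roughly 2*x + 1 plus a little noise (an assumption
# chosen purely for illustration).
rng = np.random.default_rng(0)
x_train = np.linspace(0, 10, 50)
y_train = 2.0 * x_train + 1.0 + rng.normal(0, 0.1, size=x_train.shape)

# "Learning" is fitting the pattern: find slope and intercept by least squares.
slope, intercept = np.polyfit(x_train, y_train, deg=1)

# "Prediction" is applying the learned pattern to new, unseen inputs.
x_new = np.array([12.0, 15.0])
y_pred = slope * x_new + intercept
print(slope, intercept)  # close to 2 and 1
```

Deep learning replaces the straight line with a far more flexible function, but the learn-then-predict structure is the same.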

Deep learning is modeled on how human brains supposedly learn, and uses neural networks, layered networks of neurons, to learn patterns from data and make predictions. Just as humans use different levels of conceptualization to understand a complex problem, each layer of neurons abstracts out a specific feature or concept in a hierarchical way to understand complex patterns. The beauty of deep learning is that, unlike other machine learning techniques whose prediction performance plateaus when you feed in more training data, deep learning performance continues to improve with more data. Deep learning has also been applied to very different sets of problems and shown good performance, which is typically not possible with other techniques. All of this makes deep learning special, especially for problems where you can easily throw in more data and computing power.
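The layered structure described here can be sketched as a tiny forward pass (an editorial illustration with arbitrary, untrained weights; real networks learn these weights from data):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

# A tiny three-layer network. Each layer transforms its input into a more
# abstract representation, loosely mirroring the "hierarchy of concepts" idea.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # raw features -> low-level features
W2, b2 = rng.normal(size=(8, 8)), np.zeros(8)   # low-level -> higher-level features
W3, b3 = rng.normal(size=(8, 3)), np.zeros(3)   # features -> scores for 3 classes

def forward(x):
    h1 = relu(x @ W1 + b1)
    h2 = relu(h1 @ W2 + b2)
    scores = h2 @ W3 + b3
    # Softmax turns the final scores into class probabilities.
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

probs = forward(rng.normal(size=(2, 4)))  # a batch of 2 inputs, 4 features each
print(probs.shape)  # (2, 3)
```

Adding more layers deepens the hierarchy, which is where the name “deep” learning comes from.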

Deep learning is an exciting field, with lots of experimentation and new techniques proposed over the last two to three years. Two come to mind. One is reinforcement learning, which I will explain in a minute. The other big thing that is happening is GANs, or Generative Adversarial Networks.

Both of these are breakthroughs because they address one of the key problems in AI that I highlighted: how to learn without a lot of human supervision. In the most layman terms, reinforcement learning is essentially agent-based learning, where an agent (a software program) is given an optimization goal and tries to reach it by taking multiple paths, learning from its mistakes or errors to choose the best path. This is the technique behind recent advances in machine learning such as playing video games, like Atari games, or even more advanced strategy games like Go.
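As a minimal sketch of this trial-and-error idea, here is tabular Q-learning on a made-up five-state corridor (my own illustrative assumption; the game-playing systems mentioned combine this idea with deep networks):

```python
import numpy as np

# Environment: 5 states in a row; action 0 moves left, action 1 moves right.
# Reaching state 4 pays reward 1 and ends the episode; every other step pays 0.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))    # the agent's learned value estimates
alpha, gamma, eps = 0.5, 0.9, 0.1      # learning rate, discount, exploration rate
rng = np.random.default_rng(0)

for episode in range(200):
    s = 0
    while s != 4:
        # Epsilon-greedy: usually exploit the best-known action, sometimes explore.
        if rng.random() < eps:
            a = int(rng.integers(n_actions))
        else:
            a = int(rng.choice(np.flatnonzero(Q[s] == Q[s].max())))  # random tie-break
        s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
        r = 1.0 if s_next == 4 else 0.0
        # Learn from the outcome of the chosen path: the "mistakes or errors" signal.
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

policy = [int(np.argmax(Q[s])) for s in range(4)]
print(policy)  # learned policy: go right in every non-terminal state
```

No one tells the agent the answer; the reward signal alone shapes its behavior, which is exactly the reduced-supervision property discussed above.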

The other big area that has generated tremendous interest involves Generative Adversarial Networks, or GANs for short. In layman’s terms, think about someone learning together with a buddy: we essentially have two neural models competing with and teaching each other, each improving the other to expedite the learning process. GANs work well for a class of problems called unsupervised learning, where you don’t have a lot of labeled data to tell the machine what to learn. GANs have been applied to make significant progress in image generation and video morphing, with many more applications to come.
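The two-buddies dynamic can be sketched in one dimension (an editorial toy, far simpler than real GANs; the data distribution, model forms and learning rate are all assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Generator g(z) = a*z + b tries to mimic real data drawn from N(4, 0.5).
# Discriminator D(x) = sigmoid(w*x + c) tries to tell real samples from fakes.
a, b = 1.0, 0.0   # generator parameters
w, c = 0.1, 0.0   # discriminator parameters
lr = 0.02

for step in range(2000):
    real = rng.normal(4.0, 0.5, size=32)
    z = rng.normal(size=32)
    fake = a * z + b

    # Discriminator: gradient ascent on log D(real) + log(1 - D(fake)).
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    w += lr * np.mean((1 - d_real) * real - d_fake * fake)
    c += lr * np.mean((1 - d_real) - d_fake)

    # Generator: gradient ascent on log D(fake), i.e. try to fool the discriminator.
    d_fake = sigmoid(w * fake + c)
    a += lr * np.mean((1 - d_fake) * w * z)
    b += lr * np.mean((1 - d_fake) * w)

samples = a * rng.normal(size=1000) + b
print(samples.mean())  # should have drifted from 0 toward the real mean of 4
```

Note that neither model ever sees a label; the competition itself is the teacher.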

Knowledge@Wharton: The other area of AI that gets a lot of attention is natural language processing, often involving intelligent assistants like Siri from Apple, Alexa from Amazon, or Cortana from Microsoft. How are chatbots evolving, and what is the future of the chatbot?

Saxena: This is a huge area of investment for all of the big players, as you mentioned. It is generating a lot of interest for two reasons. It is the most natural way for people to interact with machines: just talking to them and having the machines understand. And this has led to a fundamental shift in how computers and humans interact; almost everybody believes this will be the next big thing.

Still, early versions of this technology have been very disappointing. The reason is that natural language understanding or processing is extremely tough. You can’t just use one technique or deep learning model, as you can for image or speech understanding, and solve everything. Natural language understanding is inherently different. Understanding natural language or conversation requires huge amounts of background human knowledge. Because there is so much context associated with language, unless you teach your agent all of that knowledge, it falls short in understanding even basic stuff.

That’s where the challenge is. All the big companies you mentioned are investing heavily in this area. I see progress being made within narrow domains, for example ordering a pizza or solving problems such as, “My bank account is running low, can you allow me to make this transaction?” Such problems will get solved in the near term. But more open-ended discussions, such as an AI assistant acting as your psychiatrist, are much further out, because they require a depth of human knowledge and emotional understanding that AI will lack for the foreseeable future.

As I said, chatbots will do well when they operate within specific vertical domains and contexts. When the context is fixed and doesn’t vary, and, more importantly, when the user’s expectations of the chatbot are limited, chatbots will do really well.

Another area where we have seen chatbots used is what we call goal-oriented conversations. For example, setting up a meeting or an appointment between two people can be completely handed over to a chatbot. Here the context is very limited: coordinating the calendars of two people, or making a reservation at a restaurant. Instead of a human being calling a restaurant to make a reservation, a chatbot can do this automatically because the task and context are both very well defined. Anything beyond that is still difficult, in my view.
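Why a fixed context makes the problem tractable can be sketched with a tiny slot-filling loop (a hypothetical editorial example; the slot names and patterns are invented, and real assistants use learned models rather than regexes):

```python
import re

# In a fixed context like restaurant booking, "understanding" reduces to
# extracting a few known slots from the user's utterance.
SLOTS = {
    "party_size": re.compile(r"\bfor (\d+)\b"),
    "time": re.compile(r"\bat (\d{1,2}(?::\d{2})?\s?(?:am|pm))\b", re.I),
    "day": re.compile(r"\b(today|tomorrow|monday|tuesday|wednesday|thursday"
                      r"|friday|saturday|sunday)\b", re.I),
}

def parse_booking(utterance):
    filled = {}
    for name, pattern in SLOTS.items():
        m = pattern.search(utterance)
        if m:
            filled[name] = m.group(1)
    return filled

def next_prompt(filled):
    # Ask only for whatever is still missing; the dialogue "goal" is a full slot set.
    for name in SLOTS:
        if name not in filled:
            return f"What {name.replace('_', ' ')} would you like?"
    return "Booking confirmed."

slots = parse_booking("Table for 2 tomorrow at 7pm, please")
print(slots)               # {'party_size': '2', 'time': '7pm', 'day': 'tomorrow'}
print(next_prompt(slots))  # Booking confirmed.
```

Everything outside the predefined slots is simply ignored, which is exactly why such systems fail once the conversation leaves the narrow domain.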

Knowledge@Wharton: What is computer vision? Is it possible to make machines understand video the way that human beings do? What are the most promising business applications here, and the biggest challenges in making them a reality?

Saxena: Computer vision is the science of understanding images and videos. One example of understanding an image is identifying what objects are in it. The same goes for videos: in a video, you consider the different scenes you see, as well as the different people and objects in each scene.

Describing each scene by correlating different images, scenes or frames within the video is also possible, or increasingly becoming possible, to the point where AI can watch a video and summarize what it saw. All of this is within the realm of computer vision, or visual understanding.

There are many areas where computer vision can be applied. One promising application is in surveillance: we now have the ability to detect anomalies in a surveillance video. The other big application is in self-driving vehicles, where AI enables the car to understand what is on the road, detect objects, and make decisions based on them.

On the video front I clearly see huge improvements. Video is called dark data for a reason: today our ability to understand video is pretty limited. But imagine a world where machines can start understanding what’s in a video. You will see tremendous advances in the near future in machines helping humans generate videos; it will not be completely automated, but it is coming. One of the risks here is the ability to create fake videos. Recently you may have seen, it was pretty popular on social media, a video of Barack Obama speaking fake messages. It is becoming very easy to morph videos and use lip-sync technology to make anybody appear to say anything, and that caused a lot of stir in this space. So the ability to modify a video, make changes to it, and keep it realistic is going to be a huge challenge as well as a huge opportunity.

Knowledge@Wharton: That sounds incredible. Now, a number of big companies are active in AI: Google, Microsoft, Amazon and Apple in the US, and Baidu, Alibaba and Tencent in China. What opportunities exist in AI for start-ups and smaller companies? How can they add value? How do you see them fitting into the broader AI ecosystem?

Saxena: I see value for both big and small companies. A lot of the investment by the big players in this space is in building platforms on which others can build AI applications. Almost every player in the AI space, including Google, has created such platforms, similar to what they did for Android and mobile. Once the platform is built, others can build applications on it, and that is clearly where the focus is. There is a big opportunity for start-ups to build applications using some of the open-source tools created by these big players.

The second area where start-ups will continue to play is what we call vertical domains. A big part of the advances in AI will come from combining good algorithms with proprietary data. Even though the Googles of the world and other big players have some of the best engineering talent and the best algorithms, they don’t have the data. So, for example, a company that has proprietary health care data can build a health care AI start-up and compete with the big players. The same is true of industries such as finance or retail.

Knowledge@Wharton: Can you give any examples of start-ups that are doing the most significant work in AI? Why is their work important?

Saxena: There have not been many breakout successes among AI-centric start-ups yet, and by breakout successes I mean multimillion- or even billion-dollar start-ups. But there are a lot of promising start-ups across the board. For example, I have seen start-ups doing well in customer service, and some good start-ups in HR automation.

I think the intersection of robotics and AI is going to be interesting. Robotics has been disappointing for a long time in terms of wide-scale adoption, but a combination of AI and robotics could change that, and you will see some noteworthy applications coming up in that space. More human-like robots will be one big area, drawing on advances in natural language understanding and visual understanding, and of course robotics itself. That is one area I would definitely watch.

Self-driving cars are also a critical area. Within the next few years we will see commercial deployment of self-driving cars.

I am also bullish on advances in video understanding. Video understanding combined with virtual reality could create some interesting breakthroughs, so that is another area to keep watching. The common theme I see is not AI on its own, but AI combined with some other domain. That can create some compelling use cases in the near future.
