Waterloo Region Record

Alexa, can you have a conversation? (Sort of)

- KEITH COLLINS AND CADE METZ

Digital assistants like Amazon’s Echo can listen to you. And they can talk back. But that doesn’t mean they can carry on a good conversation.

As the devices that run these assistants become more commonplace (39 million Americans now own one, according to a recent study), Amazon and competitors like Apple and Google foresee a day when you can chat with their assistants as you would with a friend.

We explored the conversational limits of the leading assistants. After consulting with the companies involved and a few artificial intelligence experts, we created tests that show what they can and can’t handle.

Don’t expect the assistants to replace conversations with friends anytime soon. But the experiments, even in the moments when the assistants screwed up, showed what these assistants could one day become in our lives.

Let’s make a list

Because these devices are not yet nimble enough to understand and respond to anything you might say, Amazon, Apple, and Google must hone them for particular tasks, like playing a song or building a shopping list. If you stick to these tasks, the devices are pretty good at zeroing in on what you want. But even inside these small domains, they can trip up. Here is how.

Add these things to my shopping list …

AMAZON ECHO: What can I add for you?

APPLE HOMEPOD: I added “these things” to the shopping list.

GOOGLE HOME: What do you want to add?

Right away, the HomePod showed how difficult it can be for machines to understand natural language. The device recognized what we were doing, but misunderstood how we wanted to do it, and added an item called “these things” to a shopping list.

… guacamole, chips, tortillas.

AMAZON ECHO: “Guacamole chips tortillas” added to your shopping list.

APPLE HOMEPOD: No longer listening.

GOOGLE HOME: OK, I’ve added “guacamole” and “chips tortillas” to your shopping list.

If your roommate asked what you wanted from the market and you said “guacamole, chips, tortillas,” he would understand that as three separate things. But the Google and Amazon devices didn’t quite recognize the pause between items as a verbal comma. Although these devices are pretty good at recognizing what you say, they may not completely understand it.

A digital assistant relies on many different technology systems, all working together on the device and inside a network of computer data centers that connect to the assistant over the internet.

When you say something, one system tries to recognize each word and convert it to text. Another system tries to understand the meaning of each word and how it relates to the others. A third system spits out new text that responds to what you’ve said. A fourth converts this response to digital speech. Other systems may also weigh in, but you get the point.
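The four-step chain described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: every function name here is hypothetical, the "audio" is plain text standing in for a recording, and simple string rules stand in for the far more complex systems the companies actually run.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Intent:
    """What the assistant believes the user wants (hypothetical structure)."""
    action: str
    items: List[str]


def recognize_speech(audio: str) -> str:
    # Step 1 stand-in: real systems convert sound to text; here the
    # "audio" is already text, so we only normalize it.
    return audio.strip().lower()


def understand(text: str) -> Intent:
    # Step 2 stand-in: one hand-written rule for the shopping-list task.
    prefix, suffix = "add ", " to my shopping list"
    if text.startswith(prefix) and text.endswith(suffix):
        raw = text[len(prefix):-len(suffix)]
        # Treat each comma as a separator between list items.
        items = [part.strip() for part in raw.split(",") if part.strip()]
        return Intent("add_to_list", items)
    return Intent("unknown", [])


def generate_response(intent: Intent) -> str:
    # Step 3 stand-in: produce the text of the reply.
    if intent.action == "add_to_list":
        return "Added " + ", ".join(intent.items) + " to your shopping list."
    return "Sorry, I don't know how to help with that."


def synthesize_speech(text: str) -> bytes:
    # Step 4 stand-in: real systems produce audio; here we return bytes.
    return text.encode("utf-8")


def handle(audio: str) -> bytes:
    # Chain the four steps in the order the article describes.
    return synthesize_speech(generate_response(understand(recognize_speech(audio))))
```

In this toy version, "guacamole, chips, tortillas" becomes three items only because the commas are already written out. A spoken request has pauses instead of commas, which is where the real devices stumbled.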

Engineers used to build speech recognition systems by writing one small computing rule at a time, a painstaking process. But so-called neural networks are now replacing those handwritten rules, accelerating the progress of speech recognition. Neural networks are complex mathematical systems that can learn particular tasks by pinpointing patterns in large amounts of data. They can learn to recognize spoken words, for instance, by analyzing recordings of old customer support calls.

In some cases, the assistants perform as well as human listeners. But listening is only part of what they do.

Resolving ambiguity

Even if you make a rather specific request, it can have multiple meanings. And that can be tough for an assistant to deal with.

Who won the Giants game last night?

AMAZON ECHO: On December 31st, the Giants beat the Redskins 18 to 10. They’ll play on August 9th at 7 p.m. at home against the Browns.

APPLE HOMEPOD: Which Giants? New York Giants, or San Francisco Giants?

GOOGLE HOME: Yesterday, the Giants played the Padres. The final score was 3 to 2, Giants, in zero innings.

If you’re a sports fan, you know there are two professional teams that call themselves the Giants. If someone asks about a Giants game in the summer, you know they’re talking about the San Francisco Giants, who play baseball.

When we asked on Aug. 2, before preseason football started, Google Home came to the same conclusion. And the HomePod asked that we identify the team we were talking about. Google and Apple likely have written an explicit rule for this particular situation.
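An explicit rule of this kind might look something like the sketch below. It is a guess at the general idea, not the rule Google or Apple uses: the function name is hypothetical, and the month ranges are rough assumptions for illustration only, since real season dates shift from year to year.

```python
from datetime import date

# Rough, illustrative month ranges (assumptions, not official schedules).
BASEBALL_MONTHS = {4, 5, 6, 7, 8, 9, 10}
FOOTBALL_MONTHS = {9, 10, 11, 12, 1, 2}


def resolve_giants(asked_on: date) -> str:
    """Pick a team from the calendar, or signal that the user must be asked."""
    in_baseball = asked_on.month in BASEBALL_MONTHS
    in_football = asked_on.month in FOOTBALL_MONTHS
    if in_baseball and not in_football:
        return "San Francisco Giants"
    if in_football and not in_baseball:
        return "New York Giants"
    # Both seasons overlap, or neither is under way: ask, as the HomePod did.
    return "ask"
```

Under these assumed ranges, a question asked on Aug. 2 resolves to the baseball team, while one asked in early October falls through to the clarifying question.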

Alexa guessed that we meant the New York Giants football team, which hadn’t played a game in seven months, but this may have been because we were asking the question in New York. Amazon sometimes generates responses based on your location. In any case, Alexa figured out which team we meant after the next question.

Who do they play next?

AMAZON ECHO: First pitch of the NL West matchup between the Giants and the Diamondbacks is scheduled for tomorrow night at 9:40 p.m. This season, the Giants have bested the Diamondbacks in seven of the 12 games they’ve played.

APPLE HOMEPOD: The Giants battle the Diamondbacks tomorrow at 9:40 p.m.

GOOGLE HOME: Their next game is tomorrow at 9:40 p.m., when they will play the Diamondbacks.

When you chat with a friend, you use pronouns. To keep up, a digital assistant must recognize pronouns for what they are and map them back to words in a previous exchange. Algorithms are learning to identify pronouns by analyzing hundreds of sentences, detecting patterns in the syntax.
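Mapping a pronoun back to an earlier exchange can be pictured with a small sketch. The class, its method names, and the pronoun list are all hypothetical, and a hand-written word list stands in for the learned pattern detection described above.

```python
import re
from typing import Optional

# A small, hand-picked set of subject and object pronouns (an assumption;
# real systems learn these patterns from data rather than from a list).
PRONOUNS = {"they", "them", "he", "him", "she", "it"}


class Conversation:
    """Remembers the most recent subject so later pronouns can refer to it."""

    def __init__(self) -> None:
        self.last_entity: Optional[str] = None

    def remember(self, entity: str) -> None:
        # Called after a request whose subject has been identified.
        self.last_entity = entity

    def resolve(self, utterance: str) -> str:
        # With no earlier subject, there is nothing to map a pronoun to.
        if self.last_entity is None:
            return utterance

        def swap(match: "re.Match[str]") -> str:
            word = match.group(0)
            return self.last_entity if word.lower() in PRONOUNS else word

        # Replace each pronoun with the remembered subject, word by word.
        return re.sub(r"[A-Za-z']+", swap, utterance)
```

For example, after `remember("the San Francisco Giants")`, the follow-up "Who do they play next?" is rewritten as "Who do the San Francisco Giants play next?" before the rest of the system tries to answer it.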

After recognizing what you say, these devices must then understand what you want and generate a response. For that part, companies like Amazon and Google have also begun to rely on neural networks to learn from what people have asked these digital assistants in the past.

Engineers might train one neural network to identify what you are trying to do (build a shopping list, get the score of a baseball game) and train another to parse a sentence and identify the nouns, pronouns, verbs and other parts of speech. It’s like elementary school grammar classes for computers.

All of the machine learning algorithms these devices use are liable to make mistakes, though, and the system can be brittle because so many different algorithms are trying to work in concert. A digital assistant may screw up if you make a small change to a request that has worked countless times in the past. It may also screw up if you don’t change anything.

Machine learning algorithms can help these devices deal with turn-by-turn exchanges. But each verbal exchange is limited to a simple, three- or four-turn conversation. In a perfect world, engineers would build one giant neural network that learns to do everything. For now, these devices move toward better conversations in small steps.

Understanding subtext

In the previous example, the devices understood what we were asking because they mapped a pronoun back to the subject of our previous request. But what if we don’t use a pronoun?

Where is the nearest coffee shop?

AMAZON ECHO: Here are a few popular ones: Ramini Espresso Bar & Cafe, Best Bagel and Coffee, Paris Baguette and Cafe Grumpy. Check out the Alexa app for contact info and hours.

APPLE HOMEPOD: The nearest one I found is Bluestone Lane, located quite close to you on Times Square, which averages four stars according to Yelp. It’s open today, from 6:30 a.m. to 8 p.m.

GOOGLE HOME: The address for Culture Espresso is 307 West 38th Street, New York, New York, 10018.

The devices correctly identified nearby shops, but they were all different. That’s likely because each device is using a different “knowledge graph,” a vast database of facts and other information you may ask for. Google, for example, may use data gathered from Google Maps, whereas the other companies may pull from sources like Yelp.

Tackling opinions

Don’t expect these devices to tell you what they think, or even what other people think. It’s easier to build technology that sticks to the facts. After asking these devices to identify the photographer Robert Mapplethorpe, which they all did correctly, we asked for their opinion of his work. None of the machines seemed to grasp the concept of this question, and Google eventually lost the context altogether.

What do you think of his work?

AMAZON ECHO: I don’t have an opinion on that.

APPLE HOMEPOD: I really couldn’t say.

GOOGLE HOME: That seems like an interesting topic. What do you want to know about Robert Mapplethorpe?

JENS MORTENSEN NYT Apple's HomePod speaker, Amazon's Echo speaker, and Google's Home Smart Speaker. Digital assistants like these speakers can listen to you. And they can talk back. But is it a conversation?
