The Borneo Post

Why US giants of technology love the sound of your voice

- By Jing Cao and Dina Bass

AMAZON’S Echo has made tangible the promise of an artificially intelligent personal assistant in every home. Those who own the voice-activated gadget (known colloquially as Alexa, after its female interlocutor) are prone to advertising “her” charms, applauding Alexa’s ability to call an Uber, order pizza or check a pupil’s maths homework. The company says more than 5,000 people a day profess their love for Alexa, a claim you can probably check for yourself.

Voice recognition has come a long way in the past few years. But it’s still not good enough to popularise the technology for everyday use and usher in a new era of human-machine interaction, allowing us to talk with all our gadgets: cars, washing machines, televisions.

Despite advances in speech recognition, most people continue to swipe, tap and click. And probably will for the foreseeable future. What’s holding back progress? The artificial intelligence that powers the technology has room to improve. There’s also a serious deficit of data, specifically audio of human voices speaking in multiple languages, accents and dialects, often in noisy circumstances that can defeat the code.

So Amazon, Apple, Microsoft and China’s Baidu have embarked on a worldwide hunt for terabytes of human speech.

The challenge is finding a way to capture natural, real-world conversations.

Even 95 percent accuracy isn’t enough, says Adam Coates, who runs Baidu’s artificial intelligence lab in Sunnyvale, California.

“Our goal is to push the error rate down to 1 percent,” he says. “That’s where you can really trust the device to understand what you’re saying, and that will be transformative.”

Not so long ago, voice recognition was comically rudimentary. An early version of Microsoft’s technology running in Windows transcribed “mom” as “aunt” during a 2006 demo before an auditorium of analysts and investors.

When Apple launched Siri with much fanfare five years ago, the personal assistant’s gaffes were widely mocked because it, too, routinely spat out incorrect results or didn’t hear the question correctly. When asked if Gillian Anderson is British, Siri provided a list of English restaurants. Now Microsoft says its speech engine makes no more errors than professional transcribers, Siri is winning grudging respect, and Alexa has given us a tantalising glimpse of the future.

Much of that progress owes a debt to the magic of neural networks, a form of artificial intelligence based loosely on the architecture of the human brain. - Bloomberg

