The eleventh-hour evolution of Amazon’s voice-controlled surprise hit, the Echo
▶ How Amazon’s voice-controlled speaker became a surprise hit ▶ “We want to be a large company that’s also an invention machine”
In the fall of 2014, some of the men and women building Amazon.com’s new voice-controlled smart speaker felt they needed to confront Chief Executive Officer Jeff Bezos. The release of the speaker was looming, and things were falling into place. The device looked good, its voice recognition software was improving quickly, and the boxes it would ship in had been designed and assembled. But there was a lingering issue with the device’s name: the Amazon Flash. Many people who worked at Lab126, the hardware division, hated it, according to two former employees. Bezos was strongly in favor.
There was another worry. A core feature of the device is a “wake word” that, when spoken, cues it to take voice commands. One of the two words being considered was “Alexa.” Bezos thought the best word would be “Amazon.” This presented a challenge: The speakers would wake upon hearing Amazon ads on television and, because they connect to a Wi-Fi network, could start buying stuff from the Internet.
Generally, the engineers and product managers at Lab126 stifled their misgivings, instead concentrating on giving the boss what they thought he wanted. But weeks before the speaker was set to ship, the dissidents confronted Bezos. He agreed to the changes: The device would be called the Echo, and its wake word would be “Alexa.”
In a gadget landscape dominated by rectangular touchscreens, the Echo stands out—a screenless cylinder, just over 9 inches tall and 3¼ inches in diameter. It plays music and answers basic household questions like, “How many teaspoons are in a cup?” (about 48). The only way to interact with the Echo is to talk to it. It’s always listening for its wake word, which users can change to “Amazon” or “Echo” if they want.
When the first batch shipped to buyers in November, Amazon’s critics mocked it. Some called the Echo a useless gimmick; others pointed to it as evidence of Amazon’s Orwellian tendencies. Then something weird happened: People decided they loved it. Amazon never releases data about how its products sell, but Consumer Intelligence Research Partners issued a report on April 6 saying that the company has already sold more than 3 million of the devices, 1 million of them during the 2015 holiday season. About 35,000 people have reviewed the speaker on Amazon.com, giving it an average 4.5 stars out of 5.
Dozens of independent developers are writing apps that work with the speaker’s voice controls to do things like turn off lights or order a pizza. The Echo, which seemed like a superfluous toy at first, now looks like a way for Amazon to become the default choice for voice-controlled devices.
“We want to be a large company that’s also an invention machine,” Bezos wrote in a letter to investors in April. The Echo is what happens when Amazon achieves that goal. Bezos declined an interview for this story. Ten current and former Amazon employees agreed to talk, mostly on the condition of anonymity, because they weren’t authorized to do so by the company.
Echo comes out of Lab126, which was created in 2004 to build the Kindle e-reader. The lab’s name refers to the alphabet, with 1 representing the letter A and 26 representing Z. The Kindle was Project A. Project B was Amazon’s smartphone. Little is known about Project C, but a review of Amazon patent applications suggests a device that would display augmented-reality images that people could interact with. The Echo—Project D—was an offshoot of Project C. It started in 2011, and at one point several hundred employees from Seattle to Cambridge worked on it.
The Fire Phone, which was unveiled in June 2014, was a colossal flop. After it failed, Amazon moved some of the people who worked on it to the Echo team. They brought with them different ideas and varying levels of enthusiasm about the speaker, which grated on some who had been on the Echo since the beginning. It didn’t help that the stakes had gotten higher, since the speaker had to redeem Amazon’s reputation and dispel doubts—inside and outside the company—about whether Amazon could make desirable high-end consumer gadgets at all.
As originally conceived, the Echo was simpler and cheaper. One person who worked on the project remembers that the company expected to manufacture the devices for about $17 and sell them for $50. They now retail for $180. Bezos had lots of ideas about the speaker’s main purpose. “Jeff had a vision of full integration into every part of the shopping experience,” says one person who was at Lab126 at the time.
Amazon hired a handful of people from the speech recognition company Nuance Communications. It bought two voice response startups, Yap and Evi, to build a speech recognition system that could match those created by Google and Apple for their smartphones.
Once Amazon’s engineers started building the Echo, they realized it would need more processing power than they’d anticipated. They swapped out the microcontroller, the kind of simple computer used to run such devices as remote controls, for a microprocessor, which could handle more complex tasks.
The Echo went through several key changes at the eleventh hour. The speaker had to emit and listen for sound simultaneously, a challenge that had preoccupied engineers throughout development. What if the music on the Echo was so loud it drowned out people’s voices? Early on, engineers created prototypes for smaller devices that looked like hockey pucks, to be placed around the house to listen for commands when people strayed too far from the main speaker. That idea was later pushed aside to focus on the main device. It recently reemerged as the Echo Dot, which Amazon is selling on a limited basis.
By the fall of 2014, there was still disagreement over whether the Echo’s hearing was good enough on its own. Bezos and his top deputies were adamantly opposed to relying on any form of input other than voice control. They saw it as cheating. Some engineers disagreed, pushing for a remote that people could speak into from anywhere in the house. As it happened, the company had just made such a remote for the Fire TV, its streaming media player. The two sides agreed to send the first batch of speakers with a remote included. They’d gather information about how often people used it and tweak the product accordingly. Apparently the fears were overblown. The people using the Echo in their homes almost never touched the remote, and it was quietly removed from the box.
Connecting the Echo to Internet-enabled lightbulbs and thermostats made by other companies wasn’t a focus within Lab126 until late 2014. On a lark, an engineer had rigged the speaker to work as a voice controller for a streaming TV device. “It was something he grew to embrace, aggressively,” one employee says of Bezos’s reaction. Amazon’s vision for the Echo now relies heavily on the speaker serving as a hub for the so-called smart home. Dave Limp, Amazon’s senior vice president for devices, jokes that it’s only a matter of time before some enterprising developer writes a program to use the Echo’s voice controls to flush the toilet.
With the Echo, Amazon has found a way to insert itself into customer interactions with other devices and services. Part of this is good timing. The tech industry has been searching for the next big computing platform after mobile. Investing in some combination of voice control and artificial intelligence was prudent, and no one else has quite figured it out yet. Apple, Google, and Microsoft all have virtual assistants, but they were made to make smartphones work better.
The Echo has more than 500 “skills,” or voice-controlled apps, that can check your bank balance or make your child’s favorite animal noises. The company keeps an internal list of customer suggestions for new controls, ranking each one according to popularity to decide the order in which it will pursue them.
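At its core, a skill maps a recognized voice “intent” to a spoken response. A minimal sketch of that idea, in Python: the intent names, slot names, and response shape below are illustrative placeholders, not Amazon’s actual Alexa Skills Kit API.

```python
# Simplified sketch of how an Echo "skill" might turn a recognized
# voice intent into a spoken reply. Intent/slot names are hypothetical.

def handle_intent(intent_name, slots):
    """Map a recognized voice intent and its slots to response text."""
    if intent_name == "AnimalSoundIntent":
        sounds = {"cow": "Moo!", "duck": "Quack!"}
        animal = slots.get("animal", "cow")
        speech = sounds.get(animal, "I don't know that animal yet.")
    elif intent_name == "BankBalanceIntent":
        speech = "Your balance is one hundred dollars."  # placeholder value
    else:
        speech = "Sorry, I didn't understand that."
    # The skill hands back text for the device to speak aloud.
    return {"outputSpeech": {"type": "PlainText", "text": speech}}

print(handle_intent("AnimalSoundIntent", {"animal": "duck"}))
```

The real service delivers structured requests to a developer’s code and converts the returned text to speech on the device; this sketch only shows the dispatch pattern, not the network plumbing.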
The next big challenge for Amazon is to begin tying services together in new ways, says Julie Ask, an analyst at Forrester Research. She says being able to tell the Echo to call an Uber is fun, but incremental. “In five years, my Echo will say, ‘Hey, it’s about time to go to the airport. Should I get you a car?’ And I’ll just say yes,” she says. “That’s the difference between where we are today and where we want to be.”
By Joshua Brustein, with Spencer Soper