IN THEIR OWN WORDS

Patients were told their voices could disappear. They turned to AI to save them

2023-06-20 - AMANDA MORRIS, ALEXA JULIANA ARD and SZU YU CHEN

Ron Brady was 52 years old when he was diagnosed with ALS, which stands for amyotrophic lateral sclerosis, a neurodegenerative disease that eventually causes most people to lose their ability to speak, walk or breathe.

Now, at 55, he can't swallow food, and it's getting harder to brush his teeth and put on clothes. He likes to crack jokes, but his speech is slurred to the point where few understand him.

But he has not lost his voice. That's because he preserved his voice with a company called Voice Keeper, which is one of several companies using artificial intelligence to “bank” people's voices while they are still able to speak and recreate those voices for text-to-speech software.

Voice banking used to be expensive and time-consuming, but AI has made it more accessible to people with conditions that could impact their ability to speak, such as ALS, throat cancer, cerebral palsy and Parkinson's disease.

Patients say having a computer-generated voice that sounds like their real voice has given them a greater sense of confidence and connection to the world around them.

Brady's synthetic voice isn't a perfect match — his speech was already impaired when he recorded himself. But it has the same relaxed, deep tone, which he jokingly calls “suave.”

“My favourite thing to say is any corny dad comment that will make my wife or adult children laugh,” he said.

To Brady, getting his voice back felt like getting parts of himself back: the school administrator who commanded a room with confidence, the gregarious, talkative father, and the first college graduate in his family, whose neutral American accent was very different from the Caribbean accent of his immigrant parents.

The use of artificial intelligence has driven a surge in voice banking, particularly among ALS patients. Capturing human speech is incredibly complex. Previously, a person might have to record 1,000 to 6,000 sentences to capture every possible sound in a language. The process typically took eight to 30 hours. Those recorded sounds then went into a database, and the software would rearrange the sounds to form words and phrases.

The method is known as unit selection, and the results were “choppy,” said Tim Bunnell, director of the Nemours Center for Pediatric Auditory and Speech Sciences.

“It's intelligible, but it's very jarring,” Bunnell said. “Our unit selection voices don't sound as good as a human voice.”

His research laboratory has transitioned from older methods of speech synthesis to newer methods, such as those using artificial intelligence.

To create a digital voice, AI software analyzes a person's speech sample and then quickly scours a large database to find people speaking in similar ways. It finds patterns in how voices sound and creates a digital voice to match an individual speaker.

Most companies now only need a few hundred sentences to get enough data. But some, like Acapela Group, which partners with Team Gleason Foundation, have algorithms that can build a voice from just 50 sentences.

The use of AI has also made voice banking more affordable. Acapela Group charged US$3,000 when the company relied on unit selection, but with AI, the cost is now $999. Other companies offer the service for as low as $300. Voice banking is not covered by insurance, but most companies don't charge people unless they start using their synthesized voices.

John M. Costello, who has worked with thousands of patients as the director of the Augmentative Communication Program at Boston Children's Hospital, recommends patients work with a speech language pathologist to figure out which product best matches their capabilities and needs. He has noticed patients with realistic voices have more meaningful connections with loved ones.

“Personal voice is so important to our relationships,” he said. “There is a psychological response.”

Research shows that hearing one's mother's voice releases levels of oxytocin similar to those released by a hug from her. Oxytocin is a social bonding hormone linked to lower levels of stress and anxiety. Another study found that self-agency is enhanced by hearing one's own voice.

The ease of the newer technology convinced Anna Paula Pereira Hülle Mateus, 51, of Lafayette, Calif., to try it. When she was diagnosed with ALS in July 2022, she was hesitant to spend her energy focusing on what she might lose. She changed her mind once a doctor told her that voice banking would take about an hour and could ensure she would always have her voice.

“Now I'm very glad that I did it, because I feel that my speech is getting worse and worse,” she said.

But the company she used, Acapela Group, doesn't offer the service in Portuguese, which is her native language. To offer voice banking in different languages, companies need to develop separate algorithms for each language.

Pereira Hülle Mateus's synthetic voice in English is a little flatter but still captures her Brazilian accent.

However, she doesn't plan to listen to her synthetic voice until the moment she needs to use it — which she hopes will never come.

Each company has its own method for capturing speech, using different sentences and algorithms. So if someone banks their voice with one company, then loses the ability to speak, they could be stuck with that company, said Blair Casey, executive director of Team Gleason Foundation, a non-profit that helps ALS patients.

He has been pushing for companies to create a standardized set of phrases that can be used with any of their algorithms, so that customers can comparison shop. He's also pushing companies to give customers their original recordings so they can use them with other companies in the future.

“If something better came out, wouldn't you want to be able to try it?” he asked. “And if you don't have access to those phrase sets, you can't.”

Brian Wallach, 42, a prominent ALS activist and former federal prosecutor who lives in Illinois, was diagnosed with ALS when he was 37, the same day his youngest daughter came home from the hospital. Over the years, his voice has transformed from forceful and clear to mumbled murmurs.

When he played his synthetic voice for the first time to his family, it was so accurate his wife burst into tears, he said. Meanwhile, his youngest daughter, who had never heard his voice pre-ALS, asked, “Is that you, Daddy?”

“I said back to her, ‘It is. My voice has changed a lot, but this is what I used to sound like,’” he said.

He likes his synthetic voice, but it doesn't pronounce his wife's name, Sandra, correctly. The synthetic voice also can't express the emotions he wants to convey when he talks to his two young daughters.

Typing what he wants to say with his synthetic voice is a slow process because the muscles in his hands have weakened.

Once ALS patients lose their ability to use their hands, they must use their eyes to type — slowing down conversation even further.

This was the case for Ruth Brunton, of Rogers, Ark. She was diagnosed with ALS in March 2021, and by Christmas that year, she had lost her ability to speak. She banked her voice immediately after her diagnosis, but the company she worked with used unit selection technology, the older technology that can sound choppy or more robotic. Though she spent about a month recording 3,000 sentences, she wasn't happy with the result.

So, she was stuck using a generic voice with an American accent from Microsoft called “Heather.” But the voice failed to capture her soft-spoken British accent, which her husband jokingly called “posh” compared to his thick “scouse” accent from Liverpool.

In the voice of “Heather,” Ruth, a pragmatic, strong-willed person who was once the CEO of a non-profit that helped struggling families, started to retreat into a shell, said her husband, David Brunton. Their flirtatious banter stopped completely, and Ruth participated less in group conversations. Even saying “I love you” seemed to mean less in a voice that wasn't her own.

“She was talking because she had to, not because she wanted to,” he said.

After six months, they tried again — Ruth was able to get her original recordings back and gave them to a different company that used AI technology. Upon hearing the new voice, both Ruth and David got emotional — it felt like a part of Ruth had come back.

“I was taken aback how much it meant to me to have a voice that actually sounded like me,” Ruth said. “It may sound silly, but having my own voice increased my self-confidence.”

Small things they took for granted before, like having quiet chats together before bed, or reading books to their five grandchildren, took on a new significance.

Shortly after Christmas, Ruth got COVID, and on the morning of Feb. 10, nine days before their 40th wedding anniversary, she died. David held her hand all night.

“We've had two years of saying goodbye,” David said. “We agreed we were going to leave nothing unsaid.”

David Brunton holds his wife Ruth's dress at a lake they visited, in Rogers, Ark., in March. Ruth died in February. (DAWN BOTTOMS/THE WASHINGTON POST)

Anna Paula Pereira Hülle Mateus, centre, spends time with daughters Isadora, left, and Heloisa in their home in Lafayette, Calif. (MICHAELA VATCHEVA/THE WASHINGTON POST)

Ronald Brady and his wife, Carla Hill Brady, moved to Mazatlán, Mexico, after his ALS diagnosis. (ALEXA JULIANA ARD/THE WASHINGTON POST)

To use his synthetic voice, Brian Wallach begins by typing what he wants to say. (SAM PAAKKONEN/THE WASHINGTON POST)

Brian Wallach sits with his wife, Sandra Abrevaya, and their dog, Moon. (SAM PAAKKONEN/THE WASHINGTON POST)
