Bangkok Post

Are AI standards too white?

There is a racial divide in speech-recognition systems, researchers say

CADE METZ, NYT © 2020 THE NEW YORK TIMES COMPANY

With an iPhone, you can dictate a text message. Put Amazon’s Alexa on your coffee table, and you can request a song from across the room.

But these devices may understand some voices better than others. Speech recognition systems from five of the world’s biggest tech companies — Amazon, Apple, Google, IBM and Microsoft — make far fewer errors with users who are white than with users who are black, according to a study published in the journal Proceedings of the National Academy of Sciences.

The systems misidentified words about 19% of the time with white people. With black people, mistakes jumped to 35%. About 2% of audio snippets from white people were considered unreadable by these systems, according to the study, which was conducted by researchers at Stanford University. That rose to 20% with black people.

The study, which took an unusually comprehensive approach to measuring bias in speech recognition systems, offers another cautionary sign for AI technologies rapidly moving into everyday life.

Other studies have shown that as facial recognition systems move into police departments and other government agencies, they can be far less accurate when trying to identify women and people of colour. Separate tests have uncovered sexist and racist behaviour in chatbots, translation services and other systems designed to process and mimic written and spoken language.

“I don’t understand why there is not more due diligence from these companies before these technologies are released,” said Ravi Shroff, a professor of statistics at New York University who explores bias and discrimination in new technologies.

“I don’t understand why we keep seeing these problems.”

All these systems learn by analysing vast amounts of data. Facial recognition systems, for instance, learn by identifying patterns in thousands of digital images of faces.

In many cases, the systems mimic the biases they find in the data, similar to children picking up bad habits from their parents. Chatbots, for example, learn by analysing reams of human dialogue. If this dialogue associates women with housework and men with CEO jobs, the chatbots will do the same.

The Stanford study indicated that leading speech recognition systems could be flawed because companies are training the technology on data that is not as diverse as it could be — learning their task mostly from white people and relatively few black people.

“Here are probably the five biggest companies doing speech recognition, and they are all making the same kind of mistake,” said John Rickford, one of the Stanford researchers behind the study, who specialises in African American speech.

“The assumption is that all ethnic groups are well represented by these companies. But they are not.”

The study tested five publicly available tools from Apple, Amazon, Google, IBM and Microsoft that anyone can use to build speech recognition services. These tools are not necessarily what Apple uses to build Siri or Amazon uses to build Alexa. But they may share underlying technology and practices with services like Siri and Alexa.

The tools were tested last year, in late May and early June, and may operate differently now. The study also points out that when the tools were tested, Apple’s tool was set up differently from the others and required some additional engineering before it could be tested.

Apple and Microsoft declined to comment on the study. An Amazon spokeswoman pointed to a webpage where the company says it is constantly improving its speech recognition services. IBM did not respond to requests for comment.

Justin Burr, a Google spokesman, said the company was committed to improving accuracy. “We’ve been working on the challenge of accurately recognising variations of speech for several years, and will continue to do so,” he said.

The researchers used these systems to transcribe interviews with 42 people who were white and 73 who were black. Then they compared the results from each group, showing a significantly higher error rate with the people who were black.
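The error rates the article cites reflect the standard metric in speech-recognition research, word error rate: the number of word substitutions, insertions and deletions needed to turn the machine’s transcript into the human reference transcript, divided by the length of the reference. The sketch below is purely illustrative — it is not the Stanford team’s code, and the sample transcripts are invented — but it shows how such a group-by-group comparison is typically computed.

```python
# Illustrative sketch of a word error rate (WER) comparison between two groups
# of speakers. Not the study's actual code; sample transcripts are invented.

def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words.
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# Each pair is (human reference transcript, machine transcript).
group_a = [("please call stella and ask her to bring these things",
            "please call stella and ask her to bring these things")]
group_b = [("please call stella and ask her to bring these things",
            "please call stell and ask to bring this thing")]

avg_a = sum(word_error_rate(r, h) for r, h in group_a) / len(group_a)
avg_b = sum(word_error_rate(r, h) for r, h in group_b) / len(group_b)
print(f"group A average WER: {avg_a:.2f}")
print(f"group B average WER: {avg_b:.2f}")
```

A gap between the two averages — like the 19% versus 35% reported in the study — is what the researchers mean by a higher error rate for one group.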

The best-performing system, from Microsoft, misidentified about 15% of words from white people and 27% from black people. Apple’s system, the lowest performer, failed 23% of the time with white people and 45% of the time with black people.

Based in a largely African American rural community in eastern North Carolina, a midsize city in western New York and Washington, DC, the black testers spoke in what linguists call African American Vernacular English — a variety of English sometimes spoken by African Americans in urban areas and other parts of the United States. The white people were in California, some in the state capital, Sacramento, and others from a rural and largely white area about 500km away.

The study found that the “race gap” was just as large when comparing identical phrases uttered by both black and white people. This indicates that the problem lies in the way the systems are trained to recognise sound. The companies, it seems, are not training on enough data that represents African American Vernacular English, according to the researchers.

“The results are not isolated to one specific firm,” said Sharad Goel, a professor of engineering at Stanford and another researcher involved in the study. “We saw qualitatively similar patterns across all five firms.”

The companies are aware of the problem. In 2014, for example, Google researchers published a paper describing bias in an earlier breed of speech recognition.

Companies like Google may have trouble gathering the right data, and they may not be motivated enough to gather it. “This is difficult to fix,” said Brendan O’Connor, a professor at the University of Massachusetts Amherst who specialises in AI technologies. “The data is hard to collect. You are fighting an uphill battle.”

The companies may face a chicken-and-egg problem. If their services are used mostly by white people, they will have trouble gathering data that can serve black people. And if they have trouble gathering this data, the services will continue to be used mostly by white people.

“Those feedback loops are kind of scary when you start thinking about them,” said Noah Smith, a professor at the University of Washington. “That is a major concern.”

Photos: The Google Assistant stand at the Consumer Electronics Show in Las Vegas last year; an Amazon Echo.
