Bangkok Post

Are AI standards too white?

There is a racial divide in speech-recognition systems, researchers say

CADE METZ, NYT © 2020 THE NEW YORK TIMES COMPANY

With an iPhone, you can dictate a text message. Put Amazon’s Alexa on your coffee table, and you can request a song from across the room.

But these devices may understand some voices better than others. Speech recognition systems from five of the world’s biggest tech companies — Amazon, Apple, Google, IBM and Microsoft — make far fewer errors with users who are white than with users who are black, according to a study published in the journal Proceedings of the National Academy of Sciences.

The systems misidentified words about 19% of the time with white people. With black people, mistakes jumped to 35%. About 2% of audio snippets from white people were considered unreadable by these systems, according to the study, which was conducted by researchers at Stanford University. That rose to 20% with black people.

The study, which took an unusually comprehensive approach to measuring bias in speech recognition systems, offers another cautionary sign for AI technologies rapidly moving into everyday life.

Other studies have shown that as facial recognition systems move into police departments and other government agencies, they can be far less accurate when trying to identify women and people of colour. Separate tests have uncovered sexist and racist behaviour in chatbots, translation services and other systems designed to process and mimic written and spoken language.

“I don’t understand why there is not more due diligence from these companies before these technologies are released,” said Ravi Shroff, a professor of statistics at New York University who explores bias and discrimination in new technologies.

“I don’t understand why we keep seeing these problems.”

All these systems learn by analysing vast amounts of data. Facial recognition systems, for instance, learn by identifying patterns in thousands of digital images of faces.

In many cases, the systems mimic the biases they find in the data, similar to children picking up bad habits from their parents. Chatbots, for example, learn by analysing reams of human dialogue. If this dialogue associates women with housework and men with CEO jobs, the chatbots will do the same.

The Stanford study indicated that leading speech recognition systems could be flawed because companies are training the technology on data that is not as diverse as it could be — learning their task mostly from white people and relatively few black people.

“Here are probably the five biggest companies doing speech recognition, and they are all making the same kind of mistake,” said John Rickford, one of the Stanford researchers behind the study, who specialises in African American speech.

“The assumption is that all ethnic groups are well represented by these companies. But they are not.”

The study tested five publicly available tools from Apple, Amazon, Google, IBM and Microsoft that anyone can use to build speech recognition services. These tools are not necessarily what Apple uses to build Siri or Amazon uses to build Alexa. But they may share underlying technology and practices with services like Siri and Alexa.

The tools were tested last year, in late May and early June, and may operate differently now. The study also points out that when the tools were tested, Apple’s tool was set up differently from the others and required some additional engineering before it could be tested.

Apple and Microsoft declined to comment on the study. An Amazon spokeswoman pointed to a webpage where the company says it is constantly improving its speech recognition services. IBM did not respond to requests for comment.

Justin Burr, a Google spokesman, said the company was committed to improving accuracy. “We’ve been working on the challenge of accurately recognising variations of speech for several years, and will continue to do so,” he said.

The researchers used these systems to transcribe interviews with 42 people who were white and 73 who were black. Then they compared the results from each group, showing a significantly higher error rate with the people who were black.
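The error rates the article cites reflect the standard metric in speech-recognition research, word error rate: the number of word substitutions, insertions and deletions needed to turn the machine’s transcript into the human reference transcript, divided by the length of the reference. The sketch below is purely illustrative — it is not the Stanford team’s code, and the sample transcripts are invented — but it shows how such a group-by-group comparison is typically computed.

```python
# Illustrative sketch of a word error rate (WER) comparison between two groups
# of speakers. Not the study's actual code; sample transcripts are invented.

def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words.
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# Each pair is (human reference transcript, machine transcript).
group_a = [("please call stella and ask her to bring these things",
            "please call stella and ask her to bring these things")]
group_b = [("please call stella and ask her to bring these things",
            "please call stell and ask to bring this thing")]

avg_a = sum(word_error_rate(r, h) for r, h in group_a) / len(group_a)
avg_b = sum(word_error_rate(r, h) for r, h in group_b) / len(group_b)
print(f"group A average WER: {avg_a:.2f}")
print(f"group B average WER: {avg_b:.2f}")
```

A gap between the two averages — like the 19% versus 35% reported in the study — is what the researchers mean by a higher error rate for one group.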

The best-performing system, from Microsoft, misidentified about 15% of words from white people and 27% from black people. Apple’s system, the lowest performer, failed 23% of the time with white people and 45% of the time with black people.

Based in a largely African American rural community in eastern North Carolina, a midsize city in western New York and Washington, DC, the black testers spoke in what linguists call African American Vernacular English — a variety of English sometimes spoken by African Americans in urban areas and other parts of the United States. The white people were in California, some in the state capital, Sacramento, and others from a rural and largely white area about 500km away.

The study found that the “race gap” was just as large when comparing identical phrases uttered by both black and white people. This indicates that the problem lies in the way the systems are trained to recognise sound. The companies, it seems, are not training on enough data that represents African American Vernacular English, according to the researchers.

“The results are not isolated to one specific firm,” said Sharad Goel, a professor of engineering at Stanford and another researcher involved in the study. “We saw qualitatively similar patterns across all five firms.”

The companies are aware of the problem. In 2014, for example, Google researchers published a paper describing bias in an earlier breed of speech recognition.

Companies like Google may have trouble gathering the right data, and they may not be motivated enough to gather it. “This is difficult to fix,” said Brendan O’Connor, a professor at the University of Massachusetts Amherst who specialises in AI technologies. “The data is hard to collect. You are fighting an uphill battle.”

The companies may face a chicken-and-egg problem. If their services are used mostly by white people, they will have trouble gathering data that can serve black people. And if they have trouble gathering this data, the services will continue to be used mostly by white people.

“Those feedback loops are kind of scary when you start thinking about them,” said Noah Smith, a professor at the University of Washington. “That is a major concern.”

Photos: The Google Assistant stand at the Consumer Electronics Show in Las Vegas last year; an Amazon Echo.
