Is Amazon’s facial recognition technology biased?
Over the past two years, Amazon has aggressively marketed its facial recognition technology to police departments and federal agencies as a service to help law enforcement identify suspects more quickly. It has done so as another tech giant, Microsoft, has called on Congress to regulate the technology, arguing that it is too risky for companies to oversee on their own.
Now a new study from researchers at the MIT Media Lab has found that Amazon’s system, Rekognition, had much more difficulty in telling the gender of female faces and of darker-skinned faces in photos than similar services from IBM and Microsoft. The results raise questions about potential bias that could hamper Amazon’s drive to popularize the technology.
In the study, published Thursday, Rekognition made no errors in recognizing the gender of lighter-skinned men. But it misclassified women as men 19 percent of the time, the researchers said, and mistook darker-skinned women for men 31 percent of the time. Microsoft’s technology mistook darker-skinned women for men just 1.5 percent of the time.
A study published a year ago found similar problems in the programs built by IBM, Microsoft and Megvii, an artificial intelligence company in China known as Face++. Those results set off an outcry that was amplified when a co-author of the study, Joy Buolamwini, posted YouTube videos showing the technology misclassifying famous African-American women, like Michelle Obama, as men.
The companies in last year’s report all reacted by quickly releasing more accurate technology. For the latest study, Buolamwini said, she sent a letter with some preliminary results to Amazon seven months ago. But she said she hadn’t heard back from Amazon, and when she and a co-author retested the company’s product a couple of months later, it had not improved.
Matt Wood, general manager of artificial intelligence at Amazon Web Services, said the researchers had examined facial analysis — a technology that can spot features such as mustaches or expressions such as smiles — and not facial recognition, a technology that can match faces in photos or video stills to identify individuals. Amazon markets both services.
“It’s not possible to draw a conclusion on the accuracy of facial recognition for any use case — including law enforcement — based on results obtained using facial analysis,” Wood said in a statement. He added that the researchers had not tested the latest version of Rekognition, which was updated in November.
Proponents see facial recognition as an important advance in helping law enforcement agencies catch criminals and find missing children. Some police departments, and the FBI, have tested Amazon’s product.
But civil liberties experts warn that it can also be used to secretly identify people — potentially chilling Americans’ ability to speak freely or simply go about their business anonymously in public.
The study published last year reported that Microsoft had a perfect score in identifying the gender of lighter-skinned men in a photo database, but that it misclassified darker-skinned women as men about one in five times. IBM and Face++ had an even higher error rate, each misclassifying the gender of darker-skinned women about one in three times.
Buolamwini said she had developed her methodology with the idea of harnessing public pressure, and market competition, to push companies to fix biases in their software that could pose serious risks to people.
“One of the things we were trying to explore with the paper was how to galvanize action,” Buolamwini said.
Immediately after the study came out last year, IBM published a blog post, “Mitigating Bias in AI Models,” citing Buolamwini’s study. In the post, Ruchir Puri, chief architect at IBM Watson, said IBM had been working for months to reduce bias in its facial recognition system. The company post included test results showing improvements, particularly in classifying the gender of darker-skinned women. Soon after, IBM released a new system that the company said had a tenfold decrease in error rates.
A few months later, Microsoft published its own post, titled “Microsoft improves facial recognition technology to perform well across all skin tones, genders.” In particular, the company said, it had significantly reduced the error rates for female and darker-skinned faces.
Buolamwini wanted to learn whether the study had changed overall industry practices. So she and a colleague, Deborah Raji, a college student who did an internship at the MIT Media Lab last summer, conducted a new study.
In it, they retested the facial systems of IBM, Microsoft and Face++. They also tested the facial systems of two companies that were not included in the first study: Amazon and Kairos, a startup in Florida.
The new study found that IBM, Microsoft and Face++ improved their accuracy in identifying gender.
By contrast, the study reported, Amazon misclassified the gender of darker-skinned females 31 percent of the time, while Kairos had an error rate of 22.5 percent.
Melissa Doval, chief executive of Kairos, said the company, inspired by Buolamwini’s work, released a more accurate algorithm in October.
Buolamwini said the results of her studies raised fundamental questions for society about whether facial technology should be used in certain situations, such as job interviews, or in products, like drones or police body cameras.
Some federal lawmakers are voicing similar concerns.
“Technology like Amazon’s Rekognition should be used if and only if it is imbued with American values like the right to privacy and equal protection,” said Sen. Edward J. Markey, D-Mass., who has been investigating Amazon’s facial recognition practices. “I do not think that standard is currently being met.”