Experts tackle AI biases over gender and race
Some algorithms are trained to learn from their users and can pick up their biases
When Timnit Gebru was a student at Stanford University’s prestigious Artificial Intelligence Lab, she ran a project that used Google Street View images of cars to determine the demographic makeup of towns and cities across the U.S.
While the AI algorithms did a credible job of predicting income levels and political leanings in a given area, Gebru says her work was susceptible to bias: racial, gender, socio-economic. She was also horrified by a ProPublica report that found a computer program widely used to predict whether a criminal will reoffend discriminated against people of colour.
So this year, Gebru, 34, joined a Microsoft Corp. team called FATE — for Fairness, Accountability, Transparency and Ethics in AI. The program was set up three years ago to ferret out biases that creep into AI data and can skew results.
“I started to realize that I have to start thinking about things like bias,” says Gebru, who co-founded Black in AI, a group set up to encourage people of colour to join the artificial intelligence field. “Even my own PhD work suffers from whatever issues you’d have with data set bias.”
In the popular imagination, the threat from AI tends toward the alarmist: self-aware computers turning on their creators and taking over the planet.
The reality (at least for now) turns out to be a lot more insidious but no less concerning to the people working in AI labs around the world. Companies, government agencies and hospitals are increasingly turning to machine learning, image recognition and other AI tools to help predict everything from the creditworthiness of a loan applicant to the preferred treatment for a person suffering from cancer.
The tools have big blind spots that particularly affect women and minorities.
“The worry is if we don’t get this right, we could be making wrong decisions that have critical consequences to someone’s life, health or financial stability,” says Jeannette Wing, director of Columbia University’s Data Sciences Institute.
Researchers at Microsoft, International Business Machines Corp. and the University of Toronto identified the need for fairness in AI systems back in 2011.
Now, in the wake of high-profile incidents — including an AI beauty contest that chose predominantly white faces as winners — some of the best minds in the business are working on the bias problem.
AI is only as good as the data it learns from. Let’s say programmers are building a computer model to identify dog breeds from images. First, they train the algorithms with photos that are each tagged with breed names. Then they put the program through its paces with untagged photos of Fido and Rover and let the algorithms name the breed based on what they learned from the training data. The programmers fine-tune from there.
The algorithms continue to learn and improve, and with more time and data are supposed to become more accurate. Unless bias intrudes.
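The train-then-test routine described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the two "features" (height and weight), the numbers and the nearest-centroid method are all made up for the example, and real image systems are far more elaborate.

```python
from collections import defaultdict


def train(examples):
    """Average the feature vectors seen for each tagged breed."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(model, features):
    """Name the breed whose average is closest to the new example."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(model, key=lambda label: squared_distance(model[label]))


# Hypothetical tagged training data: (height in cm, weight in kg), breed.
TAGGED = [
    ((60.0, 30.0), "labrador"),
    ((58.0, 32.0), "labrador"),
    ((25.0, 5.0), "chihuahua"),
    ((23.0, 4.0), "chihuahua"),
]

# Hypothetical held-out examples the model never saw during training.
UNSEEN = [((59.0, 31.0), "labrador"), ((24.0, 4.5), "chihuahua")]

model = train(TAGGED)
correct = sum(predict(model, features) == breed for features, breed in UNSEEN)
accuracy = correct / len(UNSEEN)
print(f"accuracy on unseen examples: {accuracy:.0%}")
```

The point of keeping the unseen examples separate is that accuracy measured on the training data alone would say little about how the model handles Fido and Rover.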
Bias can surface in various ways. Sometimes, the training data is insufficiently diverse, prompting the software to guess based on what it “knows.”
In 2015, Google’s photo software tagged two Black users as “gorillas” because the data lacked enough examples of people of colour.
Even when the data accurately mirrors reality, the algorithms can still get the answer wrong, incorrectly guessing that a particular nurse in a photo or text is female, say, because the data shows fewer men are nurses. In some cases, the algorithms are trained to learn from the people using the software and, over time, pick up the biases of the human users.
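The nurse example can be made concrete with a toy calculation. The sketch below assumes a hypothetical data set in which nine of ten nurses are women and a deliberately naive model that always guesses the most common label; the figures are invented for illustration, not drawn from any study.

```python
from collections import Counter

# Hypothetical training labels: 9 of 10 nurses are women.
training_labels = ["female"] * 9 + ["male"] * 1

# With no other signal, always guessing the most common label
# minimises the overall error rate.
majority_label = Counter(training_labels).most_common(1)[0][0]


def guess_gender(_photo):
    """Naive model: ignore the input and return the majority label."""
    return majority_label


# Hypothetical test set with the same 9-to-1 split.
test_labels = ["female"] * 9 + ["male"] * 1
predictions = [guess_gender(None) for _ in test_labels]

pairs = list(zip(predictions, test_labels))
overall_accuracy = sum(p == t for p, t in pairs) / len(pairs)
male_total = sum(t == "male" for _, t in pairs)
male_accuracy = sum(p == t == "male" for p, t in pairs) / male_total

print(f"overall: {overall_accuracy:.0%}, male nurses: {male_accuracy:.0%}")
```

A headline accuracy figure can look respectable while the model is wrong every time for the smaller group, which is why per-group checks matter.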
AI also has a disconcertingly human habit of amplifying stereotypes. Eliminating bias isn’t easy. Improving the training data is one way. Scientists at Boston University and Microsoft’s New England lab zeroed in on so-called word embeddings — sets of data that serve as a kind of computer dictionary used by AI programs. In this case, the researchers were looking for gender bias that could lead algorithms to do things such as conclude people named John would make better programmers than ones named Mary.
In a paper called “Man is to Computer Programmer as Woman is to Homemaker?” the researchers explain how they combed through the data, keeping legitimate correlations (man is to king as woman is to queen, for one) and altering ones that were biased (man is to doctor as woman is to nurse). In doing so, they created a gender-bias-free public data set and are now working on one that removes racial biases.
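One common way to picture this kind of clean-up is to treat each word as a list of numbers, find the direction that separates "man" from "woman", and remove that component from words that should be gender-neutral. The sketch below does this with made-up three-number vectors; it is a simplified illustration of the projection idea, not the researchers' code, and the vectors and the choice of "programmer" as the neutral word are assumptions.

```python
import math


def subtract(a, b):
    return [x - y for x, y in zip(a, b)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


# Toy word vectors, invented for illustration only.
vectors = {
    "man": [1.0, 0.0, 0.2],
    "woman": [-1.0, 0.0, 0.2],
    "king": [1.0, 1.0, 0.0],
    "queen": [-1.0, 1.0, 0.0],
    "programmer": [0.4, 0.0, 1.0],
}

# The direction along which "man" and "woman" differ.
gender_direction = normalize(subtract(vectors["man"], vectors["woman"]))


def gender_component(word):
    """How far a word leans along the gender direction."""
    return dot(vectors[word], gender_direction)


def neutralize(vector, direction):
    """Remove the part of the vector that lies along the direction."""
    projection = dot(vector, direction)
    return [v - projection * d for v, d in zip(vector, direction)]


before = gender_component("programmer")
vectors["programmer"] = neutralize(vectors["programmer"], gender_direction)
after = gender_component("programmer")

print(f"programmer lean before: {before:.2f}, after: {after:.2f}")
print(f"king: {gender_component('king'):.2f}, queen: {gender_component('queen'):.2f}")
```

Only the word chosen as neutral is altered; "king" and "queen" keep their gender component, mirroring the article's distinction between legitimate and biased associations.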
“We have to teach our algorithms which are good associations and which are bad the same way we teach our kids,” says Adam Kalai, a Microsoft researcher who co-authored the paper.