San Francisco Chronicle - (Sunday)

Racism and machine learning

How computer algorithms can purposefully penalize

By Eric Siegel

What if the data tell you to be racist? Without the right precautions, machine learning — the technology that drives risk assessment in law enforcement, as well as hiring and loan decisions — explicitly penalizes underprivileged groups. Left to its own devices, the algorithm will count a black defendant’s race as a strike against the person. Yet some data scientists are calling for turning off the safeguards and unleashing computerized prejudice, signaling an emerging threat that supersedes the well-known concerns about inadvertent machine bias.

Imagine sitting across from a person being evaluated for a job, a loan or even parole. When asked how the decision process works, you inform them, “For one thing, our algorithm penalized your score by seven points because you’re black.”

We’re already heading in that discriminatory direction — and this is all the more foreboding since Gov. Jerry Brown signed SB10 into law last month. This new law mandates a heavier reliance on algorithmic decisions for criminal defendants. In the meantime, distinguished experts are now campaigning for discriminatory algorithms in law enforcement and beyond. They argue that computers should be authorized to make life-altering decisions based directly on race and other protected classes. This would mean that computers could explicitly penalize black defendants for being black.

In most cases, data scientists intentionally design algorithms to be blind to protected classes such as race, religion and gender. They implement this safeguard by prohibiting predictive models — which are the formulas that render momentous decisions such as pretrial release determinations — from considering such factors. But discriminatory practices threaten to infiltrate algorithmic decision-making.
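
To make that safeguard concrete, here is a minimal sketch in Python with pandas and scikit-learn. It illustrates the general practice rather than any agency’s actual system; the column names ("race", "religion", "gender") and the helper fit_blind_model are hypothetical.

# A minimal sketch of the safeguard described above: the protected-class
# columns are removed before the model ever sees the training data.
# Column names and the label column are hypothetical placeholders.
import pandas as pd
from sklearn.linear_model import LogisticRegression

PROTECTED_CLASSES = ["race", "religion", "gender"]  # attributes the model must not consider

def fit_blind_model(df: pd.DataFrame, label_column: str) -> LogisticRegression:
    """Train a model that is blind to protected classes by dropping those columns."""
    features = df.drop(columns=PROTECTED_CLASSES + [label_column])
    model = LogisticRegression(max_iter=1000)
    model.fit(features, df[label_column])
    return model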

I use “discriminatory” for decisions about individuals that are based in part on a protected class. For example, profiling by race or religion to determine police searches or extra airport security screening would be discriminatory. An exception would be when decisions are intended to benefit a protected group, such as for affirmative action, or when determining whether one qualifies for a grant given to members of a minority group.

Law enforcement is using predictive models more widely. SB10 eliminates cash bail and mandates that pretrial release decisions instead rest more heavily on predictive models generated automatically by machine learning. Several other states have also made moves in this direction.

Will such crime-risk models steer clear of discrimination? Although they usually avert discriminatory decisions by excluding protected classes from their inputs, there’s no guarantee they’ll stay that way.

Without due precautions, machine learning’s decisions meet the very definition of inequality. For example, to inform pretrial release, parole and sentencing decisions, the model calculates the probability (risk) of future criminal convictions. If the data link race to convictions — showing that black defendants have more convictions than white defendants — then the resulting model will penalize the score of each black defendant, just for being black, unless race has been intentionally excluded from the model. There couldn’t be a more blatant case of criminalizing blackness.
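
The mechanics can be seen in a toy example built entirely on synthetic data (no real risk instrument is reproduced here): train a standard model on data in which group membership correlates with recorded convictions, and the fitted model attaches an explicit score adjustment to the group indicator.

# Toy illustration on synthetic data: when group membership correlates with
# the outcome and is left in the inputs, the learned model encodes an explicit
# per-group score adjustment. Nothing here reproduces a real risk instrument.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000
group = rng.integers(0, 2, size=n)        # 1 = member of the disadvantaged group (hypothetical)
prior_record = rng.poisson(1.5, size=n)   # a seemingly legitimate predictor

# Synthetic, historically biased labels: the group indicator itself shifts the rate.
logit = -1.0 + 0.6 * prior_record + 0.8 * group
reconvicted = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

model = LogisticRegression().fit(np.column_stack([prior_record, group]), reconvicted)
print("score adjustment for group membership:", model.coef_[0][1])
# Positive: the model raises the risk score for group members just for being in
# the group. Dropping the group column from the inputs is the safeguard that
# prevents exactly this.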

Discriminatory decision-making by humans is pervasive, paving the way for discriminatory machine learning. Take, for example:

Screening Muslims. While the Trump administration has not attempted to implement a ban based explicitly on religion, many U.S. citizens voted for a president who ran on a campaign pledge to ban Muslims.

Transgender individuals banned from the military.

The lack of female players in certain big league sports indicates an intentional decision based on gender.

Hiring decisions. Resumes with “white-sounding” names receive 50 percent more responses than those with “African American-sounding” names.

Housing decisions. Airbnb applications from guests with “distinctively African American names are 16 percent less likely to be accepted relative to identical guests with distinctively white names,” according to Harvard University researchers.

Racial profiling by law enforcement. Until the 1970s, the risk of future crime was estimated based largely on an individual’s race and national heritage. Although this has lessened, profiling by race and religion remains in fashion. Twenty states “do not explicitly prohibit racial profiling,” according to the NAACP, and U.S. Department of Justice policy allows federal agents to racially profile within the vicinity of the U.S. border.

Polls show 75 percent of Americans support increased airport security checks based in part on ethnicity, and 25 percent support the use of racial profiling by police.

Discriminatory practices also threaten to infiltrate algorithms. A recent paper co-written by Stanford University Assistant Professor Sharad Goel — who holds appointments in two engineering departments as well as the sociology department — criticizes the standard that predictive models not be discriminatory. The paper recommends discriminatory decision-making when “protected traits add predictive value.”

In a related lecture, the Stanford professor said, “We can pretend like we don’t have the information, but it’s there. ... It’s actually good to include race in your algorithm.”

University of Pennsylvania criminology Professor Richard Berk — who has been commissioned by parole departments in Pennsylvania to build predictive models — also calls for discriminatory models. In a 2009 paper describing the application of machine learning to predict which convicts will kill or be killed while on probation or parole, he advocates for race-based prediction: “One can employ the best model, which for these data happens to include race as a predictor. This is the most technically defensible position.”

Data mean power. They fuel machine learning and, generally, the more you have, the better the predictions. Data scientists see it time and time again: Introducing any new demographic or behavioral data will potentially improve your predictive model. In this way, some data sets may compel discrimination. It’s the ultimate rationale for prejudice. The data seem to tell you, “Be racist.”
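
A hedged sketch of that dilemma, reusing the synthetic setup from the earlier toy example: on historically biased data, a standard accuracy metric can tick upward when the protected attribute is added, and that uptick is the entire rationale.

# Same synthetic setup as above: compare a standard metric (AUC) for models
# trained with and without the protected attribute. Numbers are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 20_000
group = rng.integers(0, 2, size=n)
prior_record = rng.poisson(1.5, size=n)
logit = -1.0 + 0.6 * prior_record + 0.8 * group
y = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

candidates = {
    "with protected attribute": np.column_stack([prior_record, group]),
    "without protected attribute": prior_record.reshape(-1, 1),
}
for name, X in candidates.items():
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    probs = LogisticRegression().fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
    print(name, "AUC =", round(roc_auc_score(y_te, probs), 3))
# The metric alone favors the discriminatory model; the point of this column is
# that fairness and civil rights, not the metric, should settle the question.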

But “obeying” the data and making discriminatory decisions violates the most essential notions of fairness and civil rights. Even if it is true that my group commits more crime, it would violate my rights to be held accountable for the actions of others, to have my classification count against me. We must not penalize people for their identity.

Discriminatory computers wreak more havoc than humans manually implementing discriminatory policies. Once it is computerized — that is, once it’s crystallized as an algorithm — a discriminatory decision process executes automatically, coldly and on a more significant scale, affecting greater numbers of people. Formalized and deployed mechanically, it takes on a concrete, accepted status. It becomes the system. More than any human, the computer is “the Man.”

So get more data. Just as we human decision-makers would strive to see as much beyond race as we can about a job candidate or criminal suspect, making an analogous effort — on a larger scale — to widen the database will enable our computer to transcend discrimination as well. Resistance to investing in this effort would reveal a willingness to compromise this nation’s freedoms, the very freedoms we were trying to protect with policies and law enforcement in the first place.

Eric Siegel, Ph.D., is the founder of the Predictive Analytics World and Deep Learning World conference series, the author of “Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die,” and a former computer science professor at Columbia University. Follow him at @predictanalytic. To comment, submit your letter to the editor at SFChronicle.com/letters.

Illustration: Tim Brinton / NewsArt
