Is big data laundering human bias?

The Washington Post Sunday - SUNDAY OPINION - BY LAUREL ECKHOUSE

Big data has expanded to the criminal justice system. In Los Angeles, police use computerized “predictive policing” to anticipate crimes and allocate officers. In Fort Lauderdale, Fla., machine-learning algorithms are used to set bond amounts. In states across the country, data-driven estimates of the risk of recidivism are being used to set jail sentences.

Advocates say these data-driven tools remove human bias from the system, making it more fair as well as more effective. But even as they have become widespread, we have little information about exactly how they work. Few of the organizations producing them have released the data and algorithms they use to determine risk.

We need to know more, because it’s clear that such systems face a fundamental problem: The data they rely on are collected by a criminal justice system in which race makes a big difference in the probability of arrest — even for people who behave identically. Inputs derived from biased policing will inevitably make black and Latino defendants look riskier than white defendants to a computer. As a result, data-driven decision-making risks exacerbating, rather than eliminating, racial bias in criminal justice.

Consider a judge tasked with making a decision about bail for two defendants, one black and one white. Our two defendants have behaved in exactly the same way prior to their arrest: They used drugs in the same amounts, committed the same traffic offenses, owned similar homes and took their two children to the same school every morning. But the criminal justice algorithms do not rely on all of a defendant’s prior actions to reach a bail assessment — just those actions for which he or she has been previously arrested and convicted. Because of racial biases in arrest and conviction rates, the black defendant is more likely to have a prior conviction than the white one, despite identical conduct. A risk assessment relying on racially compromised criminal-history data will unfairly rate the black defendant as riskier than the white defendant.

To make matters worse, risk-assessment tools typically evaluate their success in predicting a defendant’s dangerousness based on rearrests — not on defendants’ overall behavior after release. If our two defendants return to the same neighborhood and continue their identical lives, the black defendant is more likely to be arrested. Thus, the tool will falsely appear to predict dangerousness effectively, because the entire process is circular: Racial disparities in arrests bias both the predictions and the justification for those predictions.

We know that a black person and a white person are not equally likely to be stopped by police: Evidence on New York’s stop-and-frisk policy, investigatory stops, vehicle searches and drug arrests shows that black and Latino civilians are more likely to be stopped, searched and arrested than whites. In 2012, a white attorney spent days trying to get himself arrested in Brooklyn for carrying graffiti stencils and spray paint, a Class B misdemeanor. Even when police saw him tagging the City Hall gateposts, they sped past him, ignoring a crime for which 3,598 people were arrested by the New York Police Department the following year.

Before adopting risk-assessment tools in the judicial decision-making process, jurisdictions should demand that any tool being implemented undergo a thorough and independent peer-review process. We need more transparency and better data to learn whether these risk assessments have disparate impacts on defendants of different races. Foundations and organizations developing risk-assessment tools should be willing to release the data used to build these tools to researchers to evaluate their techniques for internal racial bias and problems of statistical interpretation. Even better, with multiple sources of data, researchers could identify biases in data generated by the criminal justice system before the data are used to make decisions about liberty. Unfortunately, producers of risk-assessment tools — even nonprofit organizations — have not voluntarily released anonymized data and computational details to other researchers, as is now standard in quantitative social science research.

For these tools to make racially unbiased predictions, they must use racially unbiased data. We cannot trust the current risk-assessment tools to make important decisions about our neighbors’ liberty unless we believe — contrary to social science research — that data on arrests offer an accurate and unbiased representation of behavior. Rather than telling us something new, these tools risk laundering bias: using biased history to predict a biased future.

The writer is a researcher with the Human Rights Data Analysis Group’s Policing Project and a doctoral candidate in political science at the University of California at Berkeley.
