Bad Code

Artificial intelligence influences everything from hiring decisions to loan approvals — but it can be as racist and sexist as humans are

The Walrus | by Danielle Groen

On a fall morning in 2010, sitting at the kitchen table of her Illinois home, Safiya Umoja Noble plugged a pair of words into Google. She was getting ready for a sleepover with her fourteen-year-old stepdaughter, whom she likes to call her bonus daughter, as well as five young nieces. Wanting to nudge them away from their cellphones, but wary that the girls could head straight for her laptop, Noble checked to see what they might find. “I thought to search for ‘black girls,’ because I was thinking of them — they were my favourite group of little black girls,” she says. But that innocuous search instantly resulted in something alarming: a page filled with links to explicit porn. Back then, anyone searching for black girls would have found the same. “It was disappointing, to say the least, to find that almost all the results of racialized girls were primarily represented by hypersexualized content,” says Noble, now a communications professor at the University of Southern California. “I just put the computer away and hoped none of the girls would ask to play with it.”

Around the same time, in another part of the country, computer scientist Joy Buolamwini discovered a different problem of representation. The Canada-born daughter of parents from Ghana, she realized that advanced facial-recognition systems, such as the ones used by IBM and Microsoft, struggled to detect her dark skin. Sometimes, the programs couldn’t tell she was there at all. Then a student at the Georgia Institute of Technology, she was working on a robotics project, only to find that the robot, which was supposed to play peekaboo with human users, couldn’t make her out. She completed the project by relying on her roommate’s light-skinned face. In 2011, at a startup in Hong Kong, she tried her luck with another robot — same result. Four years later, as a graduate researcher at MIT, she found that the latest computer software still couldn’t see her. But when Buolamwini slipped on a white mask — the sort of novelty item you could buy at a Walmart for Halloween — the technology worked swimmingly. She finished her project coding in disguise.

Facial recognition and search engines are just two feats of artificial intelligence, the scientific discipline that trains computers to perform tasks characteristic of human brains involving, among other things, math, logic, language, vision, and motor skills.

(Intelligence, like pornography, resists easy definition; scientists and engineers haven’t homed in on a single, common-use description, but they know it when they see it.) Self-driving cars may not quite be ready to prowl city streets, but virtual assistants, such as Alexa, are all set to book a midday meeting at that coffee place you like. Improvements in language processing mean that you can read a translated Russian newspaper article in English on your phone. Recommender systems are terrifically adept at selecting music geared specifically to your taste or suggesting a Netflix series for you to binge over a weekend.

These aren’t the only areas where AI systems step in to make assessments that affect our lives. In some instances, what’s at stake is only our time: when you call for support from a bank, for example, your placement on the wait-list may not be sequential but contingent on your value to it as a customer. (If the bank thought your investment portfolio was more promising, you might only be on hold for three minutes, instead of eleven.) But AI also increasingly influences our potential employment, our access to resources, and our health. Applicant-tracking systems scan resumés for keywords in order to shortlist candidates for a hiring manager. Algorithms — the set of instructions that tell a computer what to do — currently evaluate who qualifies for a loan and who is investigated for fraud. Risk-prediction models, including the one used in several Quebec hospitals, identify which patients are likeliest to be readmitted within forty-five days and would benefit most from discharge or transitional care.

AI informs where local policing and federal security are headed as well. In March 2017, the Canada Border Services Agency announced it would implement facial-matching software in its busiest international airports; kiosks in several locations, from Vancouver to Ottawa to Halifax, now use the system to confirm identities with passports and, according to a Government of Canada tender, “provide automated traveller risk assessment.” Calgary police have used facial recognition to compare video surveillance with mug shots since 2014, and last fall, the Toronto Police Services Board declared it would put part of its $18.9 million Policing Effectiveness and Modernization grant toward implementing similar technology. And while traditional policing responds to crimes that have actually occurred, predictive policing relies on historical patterns and statistical modelling, in part to forecast which neighbourhoods are at a higher risk of crime and then direct squad cars to those hot spots. Major US jurisdictions have already rolled out the software, and last summer, Vancouver became the first Canadian city to do the same.

AI systems are only as clever as the data on which they’re trained, which means that our limitations become theirs as well.

These technologies are prized for their efficiency, cost-effectiveness, and scalability — and for their promise of neutrality. “A statistical system has an aura of objectivity and authority,” says Kathryn Hume, the vice-president of product and strategy for Integrate.ai, an AI-focused startup in Toronto. While human decision-making can be messy, unpredictable, and governed by emotions or how many hours have passed since lunch, “data-driven algorithms suggest a future that’s immune to subjectivity or bias. But it’s not at all that simple.”

Artificial intelligence may have cracked the code on certain tasks that typically require human smarts, but in order to learn, these algorithms need vast quantities of data that humans have produced. They hoover up that information, rummage around in search of commonalities and correlations, and then offer a classification or prediction (whether that lesion is cancerous, whether you’ll default on your loan) based on the patterns they detect. Yet they’re only as clever as the data they’re trained on, which means that our limitations — our biases, our blind spots, our inattention — become theirs as well.

Earlier this year, Buolamwini and a colleague published the results of tests on three leading facial-recognition programs (created by Microsoft, IBM, and Face++) for their ability to identify the gender of people with different skin tones. More than 99 percent of the time, the systems correctly identified a lighter-skinned man. But that’s no great feat when data sets skew heavily toward white men; in another widely used data set, the training photos used to make identifications are of a group that’s 78 percent male and 84 percent white. When Buolamwini tested the facial-recognition programs on photographs of black women, the algorithm made mistakes nearly 34 percent of the time. And the darker the skin, the worse the programs performed, with error rates hovering around 47 percent — the equivalent of a coin toss. The systems didn’t know a black woman when they saw one.

Buolamwini was able to calculate these results because the facial-recognition programs were publicly available; she could then test them on her own collection of 1,270 images, made up of politicians from African and Nordic countries with high rates of women in office, to see how the programs performed. It was a rare opportunity to evaluate why the technology failed in some of its predictions.
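Because the programs were public, an audit like that comes down to straightforward bookkeeping: run each system over a labelled benchmark and tally its mistakes separately for every group. Here is a minimal, runnable sketch of that tally in Python, with invented records standing in for the real benchmark and the real systems’ outputs:

```python
from collections import defaultdict

def audit_by_group(records):
    """records holds (true_label, predicted_label, group) triples;
    returns the error rate for each group."""
    errors, totals = defaultdict(int), defaultdict(int)
    for truth, prediction, group in records:
        totals[group] += 1
        if prediction != truth:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

# Four invented results, standing in for thousands of real benchmark images.
records = [
    ("male", "male", "lighter-skinned men"),
    ("female", "female", "lighter-skinned women"),
    ("female", "male", "darker-skinned women"),   # the kind of miss Buolamwini measured
    ("male", "male", "darker-skinned men"),
]
print(audit_by_group(records))
```

A single overall accuracy figure would average those groups together; breaking the tally out by group is what made the disparity visible.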

But transparency is the exception, not the rule. The majority of AI systems used in commercial applications — the ones that mediate our access to services like jobs, credit, and loans — are proprietary, their algorithms and training data kept hidden from public view. That makes it exceptionally difficult for an individual to interrogate the decisions of a machine or to know when an algorithm, trained on historical examples checkered by human bias, is stacked against them. And forget about trying to prove that AI systems may be violating human rights legislation. “Most algorithms are a black box,” says Ian Kerr, Canada Research Chair in ethics, law, and technology. In part, that’s because companies will use government or trade-secrets laws to keep their algorithms obscure. But he adds that “even if there was perfect transparency by the organization, it may just be that the algorithms or AI are unexplainable or inscrutable.”

“People organized for civil rights and non-discriminatory lending practices and went to court under the protections of the law to try and change those practices,” says Noble, who recently published a book called Algorithms of Oppression. “Now we have similar kinds of decision-making that’s discriminatory, only it’s done by algorithms that are difficult to understand — and that you can’t take to court. We’re increasingly reduced to scores and decisions by systems that are very much the product of human beings but from which the human being is increasingly invisible.”

If you want to build an intelligent machine, it’s not a bad idea to start by mining the expertise of an intelligent person. Back in the 1980s, developers made an early AI breakthrough with so-called expert systems, in which a learned diagnostician or mechanical engineer helped design code to solve a particular problem. Think of how a thermostat works: it can be instructed through a series of rules to keep a house at a designated temperature or to blast warm air when a person enters the room. That seems pretty nifty, but it’s a trick of those rules and sensors — if [temperature drops below X] then [crank heat to Y]. The thermostat hasn’t learned anything meaningful about cold fronts or after-work schedules; it can’t adapt its behaviour.

Machine learning, on the other hand, is a branch of artificial intelligence that teaches a computer to perform tasks by analyzing patterns, rather than systematically applying rules it’s been given. Most often, that’s done with a technique called supervised learning. Humans aren’t off the hook yet: a programmer has to assemble her data, known as the inputs, and assign it labels, known as the outputs, so the system is told what to look for. Say our computer scientist would like to build some sort of fruit-salad object-recognition system that separates strawberries (a worthy addition to any fruit salad) from bananas (utter waste of space). She needs to select features — let’s go with colour and shape — that are highly correlated with the fruit so the machine can distinguish between them. She labels images of objects that are red and round strawberries and those that are yellow and long bananas, and then she writes some code that assigns one value to represent colour and another to characterize shape.
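The contrast between the two approaches fits in a few lines of Python. Everything below is invented for illustration: the thermostat rule is written by hand, while the fruit examples are just the colour and shape features described above, encoded as made-up numbers between zero and one and paired with their labels.

```python
# A rule-based "expert system" in miniature: all of the behaviour is written
# out by hand, nothing is learned. The 21-degree setpoint is invented.
def thermostat(temperature_c, setpoint_c=21.0):
    # if [temperature drops below X] then [crank heat to Y]
    return "heat on" if temperature_c < setpoint_c else "heat off"

print(thermostat(18.0))  # heat on

# Supervised learning starts from labelled examples instead. Each fruit is
# reduced to the two chosen features, colour (0 = red, 1 = yellow) and shape
# (0 = round, 1 = long), paired with the label the programmer assigns.
training_data = [
    ((0.05, 0.10), "strawberry"),
    ((0.10, 0.20), "strawberry"),
    ((0.90, 0.85), "banana"),
    ((0.95, 0.90), "banana"),
]
```

The hand-written rule never improves; the labelled pairs are what a learning algorithm improves on.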

“Data-driven algorithms suggest a future that’s immune to subjectivity or bias. But it’s not all that simple.”

She feeds a mess of photos of strawberries and bananas into the machine, which builds up an understanding of the relationship between these features, enabling it to make educated guesses about what kind of fruit it’s looking at. The system isn’t going to be especially good when it starts out; it needs to learn from a robust set of examples. Our supervisor, the computer scientist, knows that this particular input is a strawberry, so if the program selects the banana output, she penalizes it for a wrong answer. Based on this new information, the system adjusts the connection it made between features in order to improve its prediction the next time. Quickly — because a pair of outputs isn’t much of a sweat — the machine will be able to look at a strawberry or a banana it hasn’t seen and accurately identify it.

“Some things are easy to conceptualize and write software for,” says Graham Taylor, who leads the machine-learning research group at the University of Guelph. But maybe you want a system capable of recognizing more complex objects than assorted fruit. Maybe you want to identify a particular face among a sea of faces. “That’s where deep learning comes in,” Taylor says. “It scales to really large data sets, solves problems quickly, and isn’t limited to the knowledge of the expert who defines the rules.”

Deep learning is the profoundly buzzy branch of machine learning inspired by the way our brains work. Put very simply, a brain is a collection of billions of neurons connected by trillions of synapses, and the relative strength of those connections — like the one between red and red fruit, and the one between red fruit and strawberry — becomes adjusted through learning processes over time. Deep-learning systems rely on an electronic model of this neural network. “In your brain, neurons send little wires that carry information to other neurons,” says Yoshua Bengio, a Montreal computer scientist and one of the pioneers of deep learning. The strength or weakness of the signal between a pair of neurons is known as the synaptic weight: when the weight is large, one neuron exerts a powerful influence on another; when it’s small, so is the influence. “By changing those weights, the strength of the connection between different neurons changes,” he says. “That’s the clue that AI researchers have taken when they embarked on the idea of training these artificial neural networks.”
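That penalize-and-adjust loop can be written as a toy: a single artificial neuron with one weight per fruit feature and a bias, nudged a little toward the right answer each time it is penalized for a wrong one. The feature values, learning rate, and number of passes below are invented for illustration.

```python
import random

random.seed(0)

# (colour, shape) features for each fruit; label 1 means banana, 0 strawberry.
training_data = [
    ((0.05, 0.10), 0), ((0.10, 0.20), 0),
    ((0.90, 0.85), 1), ((0.95, 0.90), 1),
]

weights = [random.uniform(-1, 1), random.uniform(-1, 1)]  # start out random
bias = 0.0
learning_rate = 0.1

for _ in range(20):  # twenty passes over the examples
    for (colour, shape), label in training_data:
        prediction = 1 if colour * weights[0] + shape * weights[1] + bias > 0 else 0
        error = label - prediction                    # the penalty for a wrong answer
        weights[0] += learning_rate * error * colour  # strengthen or weaken each
        weights[1] += learning_rate * error * shape   # connection accordingly
        bias += learning_rate * error

# A roundish red fruit the system has never seen should now come out as a strawberry.
print(1 if 0.08 * weights[0] + 0.15 * weights[1] + bias > 0 else 0)  # 0: strawberry
```

After a couple of passes the mistakes stop and the weights settle, which is all the “learning” amounts to here: connections strengthened or weakened until the errors go away.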

Men were six times more likely than women to be shown Google advertisements for jobs with salaries upwards of $200,000.

And that’s one way AI can advance from sorting strawberries and bananas to recognizing faces. A computer scientist supplies the labelled data — all those particular faces attached to their correct names. But rather than requiring she tell it what features in the photographs are important for identification, the computer extracts that information entirely on its own. “Your input here is the image of the face, and then the output is the decision of who that person is,” Taylor says. In order to make the journey from input to output, the image undergoes several transformations. “It might be transformed first into a very low-level representation, just enumerating the types of edges and where they are,” he says. The next representation might be corners and the junctions of those edges, then patterns of edges that make up a shape. A couple of circles could turn out to be an eye. “Each layer of representation is a different level of abstraction in terms of features,” Taylor explains, “until you get to the very high-level features, things that are starting to look like an identity — hairstyle and jawline — or properties like facial symmetry.”

How does this whole process happen? Numbers. A mind-boggling quantity of numbers. A facial-recognition system, for example, will analyze an image based on the individual pixels that make it up. (A megapixel camera uses a 1,000-by-1,000-pixel grid, and each pixel has a red, green, and blue value, an integer between zero and 255, that tells you how much of the colour is displayed.) The system analyzes the pixels through those layers of representation, building up abstraction until it arrives at an identification all by itself.

But wait: while this face is clearly Christopher Plummer, the machine thinks it’s actually Margaret Trudeau. “The model does very, very poorly in the beginning,” Taylor says. “We can start by showing it images and asking who’s in the images, but before it’s trained or done any learning, it gets the answer wrong all the time.” That’s because, before the algorithm gets going, the weights between the artificial neurons in the network are randomly set. Through a gradual process of trial and error, the system tweaks the strength of connections between different layers, so when it is presented with another picture of Christopher Plummer, it does a little better. Small adjustments give rise to slightly better connections and slightly lower rates of error, until eventually the system can identify the correct face with high accuracy. It’s this technology that allows Facebook to alert you when you’ve been included in a picture, even if you remain untagged. “The cool thing about deep learning is that we can extract everything without having someone say, ‘Oh, these features are useful for recognizing a particular face,’” Taylor says. “That happens automatically, and that’s what makes it magic.”
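At toy scale, that pipeline is nothing but numbers flowing through layers of weights. The sketch below invents an eight-by-eight greyscale “photo” (real systems work with the full red-green-blue grids described above) and two candidate identities; before any training, with the weights set at random, its guess is essentially arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

image = rng.integers(0, 256, size=(8, 8))   # an 8x8 grid of 0-255 pixel values
pixels = image.flatten() / 255.0            # 64 numbers the network can digest

# Layer 1: 64 pixel values -> 16 mid-level features (edges, corners, ...)
w1 = rng.normal(0.0, 0.1, size=(64, 16))
# Layer 2: 16 mid-level features -> one score per candidate identity
w2 = rng.normal(0.0, 0.1, size=(16, 2))

hidden = np.maximum(0.0, pixels @ w1)       # a simple nonlinearity between layers
scores = hidden @ w2

labels = ["Christopher Plummer", "Margaret Trudeau"]
print(labels[int(np.argmax(scores))])       # an untrained, effectively random guess
```

Training is the gradual process described above: compare the guess with the right name, then nudge w1 and w2 so the error shrinks, millions of times over.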

Here’s a trick: type CEO into Google Images and watch as you conjure an array of barely distinguishable white male faces. If you’re in Canada, you’ll see sprinkled among them a handful of mostly white women, a few people of colour, and Wonder Woman’s Gal Gadot. In California, at a machine-learning conference late last year, a presenter had to scroll through a sizeable list of white dudes in dark suits before she landed on the first image of a female CEO. It was CEO Barbie.

Data is essential to the operation of an AI system. And the more complicated the system — the more layers in the neural nets, to translate speech or identify faces or calculate the likelihood someone defaults on a loan — the more data must be collected. Programmers might rely on stock photos or Wikipedia entries, archived news articles or audio recordings. They could look at the history of university admissions and parole records. They want clinical studies and credit scores. “Data is very, very important,” says Doina Precup, a professor at McGill’s School of Computer Science. The more data you have, “the better the solution will be.”

But not everyone will be equally represented in that data. Sometimes, this is a function of historical exclusion: In 2017, women represented just 6.4 percent of Fortune 500 CEOs, which is a whopping 52 percent increase over the number from the year before. Health Canada didn’t explicitly request women be included in clinical trials until 1997; according to the Heart and Stroke Foundation’s 2018 Heart Report, two-thirds of heart-disease clinical research still focuses on men, which helps explain why a recent study found that symptoms of heart disease are missed in more than half of women. Since we know that women have been kept out of those C-suites and trials, it’s safe to assume that their absence will skew the results of any system trained on this data. And sometimes, even when ample data exists, those who build the training sets don’t take deliberate measures to ensure its diversity, leading to a facial-recognition program — like the kind Buolamwini needed a novelty mask to fool — that has very different error rates for different groups of people.

The result is something called sampling bias, and it’s caused by a dearth of representative data. An algorithm is optimized to make as few mistakes as it possibly can; the goal is to lower its number of errors. But the composition of the data determines where that algorithm directs its attention. Toniann Pitassi, a University of Toronto computer science professor who focuses on fairness in machine learning, offers the example of a school admission program. (University admissions processes in Canada don’t yet rely on algorithms, Taylor says, but data scientist Cathy O’Neil has found examples of American schools that use them.) “Say you have 5 percent black applicants,” Pitassi says. “If 95 percent of the applicants to that university are white, then almost all your data is going to be white. The algorithm is trying to minimize how many mistakes it makes over all that data, in terms of who should get into the university. But it’s not going to put much effort into minimizing errors in that 5 percent, because it won’t affect the overall error rate.”

“A lot of algorithms are trained by seeing how many answers they get correct in the training data,” explains Suresh Venkatasubramanian, a professor at the University of Utah’s School of Computing. “That’s fine, but if you just count up the answers, there’s a small group that you’re always going to make mistakes on. It won’t hurt you very much to do so, but because you systematically make mistakes on all members of that tiny group, the impact of the erroneous decisions is much more than if your errors were spread out across multiple groups.”

It’s for this reason that Buolamwini still found IBM’s facial-recognition technology to be 87.9 percent accurate overall. When a system is correct 92.9 percent of the time on the light-skinned women in a data set and 99.7 percent of the time on the light-skinned men, it doesn’t matter that it’s making mistakes on nearly 35 percent of black women. Same with Microsoft’s algorithm, which she found was 93.7 percent accurate at predicting gender. Buolamwini showed that nearly the very same number — 93.6 percent — of the gender errors were on the faces of dark-skinned subjects. But the algorithm didn’t need to care.
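The arithmetic behind that indifference is short. Using the admissions example above, with invented per-group error rates, the overall number barely registers a group the system gets entirely wrong:

```python
majority_share, minority_share = 0.95, 0.05
majority_error, minority_error = 0.02, 1.00  # wrong on every applicant in the 5 percent group

overall_error = majority_share * majority_error + minority_share * minority_error
print(f"overall error rate: {overall_error:.1%}")      # 6.9%
print(f"overall accuracy:   {1 - overall_error:.1%}")  # 93.1%
```

A model that fails every member of the smaller group can still report an overall accuracy above 93 percent, which is roughly the pattern Buolamwini documented.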

Spend enough time in deep enough conversation with artificial-intelligence experts, and at some point, they will all offer up the same axiom: garbage in, garbage out. It’s possible to sidestep sampling bias and ensure that systems are being trained on a wealth of balanced data, but if that data comes weighted with our society’s prejudices and discriminations, the algorithm isn’t exactly better off. “What we want is data that is faithful to reality,” Precup says. And when reality is biased, “the algorithm has no choice but to reflect that bias. This is how algorithms are formulated.” Occasionally, the bias that’s reflected is almost comic in its predictability. Web searches, chat bots, image-captioning programs, and machine translation increasingly rely on a technique called word embeddings. It works by
