Business Standard

Human biases in data mining


how data is collected and used, often without our knowledge and almost always without our input. Weigend, the former chief scientist at Amazon, details the “social data” that emanates from billions of cameras, sensors and other devices, as well as social networks, online retailers and dating apps. Data refineries — those companies and people who turn our digital crude into profitable information — hunt for patterns, then sort us into buckets based on our behaviour. As Weigend points out, this exchange benefits everyone: If we let ourselves be mined, we receive personalised recommendations, connections and deals. Yet there’s an imbalance of power. Companies make a lot of money from our data, and we have very little say in how it’s used.

Weigend argues persuasively that in this “post-privacy” world, we should give our data freely, but that we should expect certain protections in return. He advocates a set of rights to increase data refineries’ transparency and to increase our own agency in how information is used. Companies like OkCupid, WeChat and Spotify should perform data safety audits, submit to privacy ratings and calculate a score based on the benefits they provide — a sort of credit score for the companies that mine our data.

Not everyone believes that our information should be freely available as long as we agree to the terms of use. In The Art of Invisibility, the internet security expert Kevin Mitnick advocates the opposite. Mitnick notes various reasons we may want to hide our data: We’re wary of the government; we don’t want businesses intruding into our lives; we have a mistress; we are the mistress; we’re a criminal. Mitnick, who served five years in prison for hacking into corporate networks and stealing software, offers a sobering reminder of how our raw data — from email, cars, home Wi-Fi networks and so on — makes us vulnerable.

Both books are meant to scare us, and the central theme is privacy: Without intervention, they suggest, we’ll come to regret today’s inaction. I agree, but the authors miss the real horror show on the horizon. The future’s fundamental infrastructure is being built by computer scientists, data scientists, network engineers and security experts who do not recognise their own biases. Those blind spots bake a flaw into the foundation itself. The next layer will be just a little off, along with the next one and the one after that, as the problems compound.

Human bias creeps into computerised algorithms in disconcerting ways. In 2015, Google’s photo app mistook a black software developer for a gorilla in photos he uploaded. In 2016, the Microsoft chatbot Tay went on a homophobic, anti-Semitic rampage after just one day of interactions on Twitter. Months later, reporters at ProPublica uncovered how algorithms in police software discriminate against black people while mislabelling white criminals as “low risk.” Recently, when I searched “C.E.O.” on Google Images, the first woman listed was C.E.O. Barbie.

Data scientists aren’t inherently racist, sexist, anti-Semitic or homophobic. But they are human, and they harbour unconscious biases just as we all do. This comes through in both books. In Mitnick’s, women appear primarily in anecdotes and always as unwitting, jealous or angry. Weigend’s book is meticulously researched, yet nearly all the experts he quotes are men.

Early on he tells the story of Latanya Sweeney, who in the 1990s produced a now famous study of anonymised public health data in Massachusetts. But Sweeney is far better known for something Weigend never mentions: She’s the Harvard professor who discovered that — because of her black-sounding name — she was appearing in Google ads for criminal records and background checks. Weigend could have cited her to address bias in the second of his six rights, involving the integrity of a refinery’s social data ecosystem. But he neglects to discuss the well-documented sexism, racism, xenophobia and homophobia in the machine-learning infrastructure.

The omission of women and people of colour from something as benign as book research illustrates the real challenge of unconscious bias in data and algorithms. Weigend and Mitnick rely only on what’s immediate and familiar — an unfortunately common practice in the data community.

As a futurist, I try to figure out how your data will someday power things like artificially intelligent cars, computer-assisted doctors and robot security agents. That’s why I found both books concerning. You may look like Weigend and Mitnick and therefore may not have experienced algorithmic discrimination yet. You, too, should be afraid. We’ve only recently struck oil.

Data for the People: How to Make Our Post-Privacy Economy Work for You
Andreas Weigend
Basic Books; 299 pages; $27.99

The Art of Invisibility: The World’s Most Famous Hacker Teaches You How to Be Safe in the Age of Big Brother and Big Data
Kevin Mitnick with Robert Vamosi
Little, Brown & Company; 309 pages; $28

