The Mercury News

Researchers question Census Bureau’s new approach to privacy


PROVIDENCE, R.I. >> In an age of rapidly advancing computer power, the U.S. Census Bureau recently undertook an experiment to see if census answers could threaten the privacy of the people who fill out the questionnaires.

The agency went back to the last national headcount, in 2010, and reconstructed individual profiles from thousands of publicly available tables. It then matched those records against other public population data. The result: Officials were able to infer the identities of 52 million Americans. Confronted with that discovery, the bureau announced that it would add statistical “noise” to the 2020 data, essentially tinkering with its own numbers to preserve privacy. But that idea creates its own problems, and social scientists, redistricting experts and others worry that it will make next year’s census less accurate. They say the bureau’s response is overkill.

“This is a brand-new, radically more conservative definition of privacy,” University of Minnesota demographer Steven Ruggles said.

Federal law bars census officials from disclosing any individual’s responses. But data-crunching computers can tease out likely identities from the broader census results when combined with other personal information.

Some critics fear the agency’s changes could make it harder to draw new congressional and legislative districts accurately. Others worry that research on immigration, demographics, the opioid epidemic and declining life expectancy will be hindered, particularly when it involves less populated areas.

If the change had been in place four years ago, Ruggles said, he would not have been able to conduct a 2015 study on the impact of declines in young men’s incomes on marriage. With more and more data sets available to the public with a quick download, it has become easier than ever to match information with real names. That means aggregated answers to census questions involving race, housing and relationships could lead to individuals.

The fear is that advertisers, market researchers or anybody with know-how and curiosity could use data to reconstruct the identities of census respondents.

When the bureau went back to the 2010 census, it matched the census data with commercial databases. More than 1 in 6 respondents were identified by name and neighborhood as well as by information about their race, ethnicity, sex and age.

Since the last census, “the data world has changed dramatically,” Ron Jarmin, deputy director of the census agency, wrote earlier this year. “Much more personal information is available online and from commercial providers, and the technology to manipulate that data is more powerful than ever.”

The unsuccessful effort by President Donald Trump’s administration to add a citizenship question to the 2020 questionnaire heightened fears about how census information would be used. But privacy concerns are nothing new for the bureau. Historians have found evidence that census data helped identify Japanese Americans who were rounded up and confined to camps during World War II. That revelation led to an apology from then-Census Bureau Director Kenneth Prewitt in 2000.

Jewish groups and some liberal organizations had concerns about privacy when the bureau was lobbied to ask about religion for the 1960 census. Some noted that Nazis had used government and church records to identify and round up Jews. The idea never went anywhere.

During the legal battle over the citizenship question, advocates worried that the information could be used to target residents in the country illegally. Some say lingering concerns could have a chilling effect on the 2020 census.

To address those worries, the bureau has adopted a technique called “differential privacy,” which alters the numbers to protect the identities of individual respondents but does not change core findings.

It’s analogous to pixelating the data, a technique commonly used to blur certain images on television, said Michael Hawes, senior adviser for data access and privacy at the Census Bureau.
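For readers curious what that blurring can look like in practice, here is a minimal Python sketch of the Laplace mechanism, a textbook way of achieving differential privacy. The bureau's actual algorithm is not described in this report, so the block names, head counts and `epsilon` setting below are all made up for illustration.

```python
import random

def laplace_noise(scale, rng):
    """Draw one sample from a Laplace distribution centered at 0."""
    # The difference of two independent Exponential(1) draws is Laplace(0, 1).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def noisy_counts(counts, epsilon, rng):
    """Return a blurred copy of block-level head counts.

    Adding or removing one person changes a single count by 1, so the
    noise scale is 1 / epsilon. A smaller epsilon means more blur.
    """
    scale = 1.0 / epsilon
    return {block: n + laplace_noise(scale, rng) for block, n in counts.items()}

# Hypothetical head counts for three made-up census blocks (assumption).
true_counts = {"block_a": 12, "block_b": 340, "block_c": 5120}
rng = random.Random(2020)  # fixed seed so the sketch is repeatable
blurred = noisy_counts(true_counts, epsilon=0.5, rng=rng)

for block, n in true_counts.items():
    print(block, n, round(blurred[block], 1))
```

With `epsilon=0.5`, each count in this sketch is off by about two people on average, whether the block holds 12 residents or 5,120. That is why the same blur weighs proportionally more on sparsely populated areas.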

Redistricting experts say the mathematical blurring could cause problems because they rely on precise numbers to draw congressional and state and local legislative districts. They also worry that it could dilute minority voting power and violate the Voting Rights Act.

“The numbers might be off by five, 10, 20 people, and if you’re dealing with exact percentages, that could mean something. That could mean a lot,” said Jeffrey M. Wice, a national redistricting attorney. “That’s why we care about it so much.”

In the past, the bureau has used swapping and other methods to protect confidentiality. Swapping involves taking similar households in different geographic areas and exchanging their demographic characteristics.

Census data does not need to be exact for most purposes, “as long as we know it’s really pretty close,” said Justin Levitt, an election law professor at Loyola Law School in Los Angeles. But “there’s certainly a point where blurry becomes too blurry.”

THE ASSOCIATED PRESS: The U.S. Census Bureau is creating tighter privacy controls in response to new fears that census questions could threaten the privacy of the people who answered them. But social scientists and others worry that the change will hurt the accuracy of the 2020 count.
