Milwaukee Journal Sentinel

Potential privacy lapse found in 2010 census

- Seth Borenstein

WASHINGTON – An internal team at the Census Bureau found that basic personal information collected from more than 100 million Americans in 2010 could be reconstructed from obscured data, but with lots of mistakes, a top agency official disclosed Saturday.

The age, gender, location, race and ethnicity for 138 million people were potentially vulnerable. So far, however, only internal hacking teams have found such details to be at possible risk, and no outside groups are known to have grabbed data intended to remain private for 72 years, chief scientist John Abowd said.

The Census Bureau is scrapping its old data shielding technique for a state-of-the-art method that Abowd claimed is better than Google’s or Apple’s.

Some former agency chiefs fear the potential privacy problem will add to the worries that people will avoid answering or lie on the once-every-10-year survey because of the Trump administration’s attempt to add a much-debated citizenship question.

The Supreme Court on Friday announced that it would rule on that proposed question, which has been criticized for being political and not properly tested in the field. The census count is hugely important, helping with the allocation of seats in the House of Representatives and distribution of billions of dollars in federal money.

The 8 billion statistics in census data are supposed to be jumbled so that what is released publicly for research cannot identify individuals for more than seven decades. In 2010, the Census Bureau did this by swapping similar household information from one city to another, according to Duke University statistics professor Jerome Reiter.

In the internal tests, Abowd said, officials were able to match 45 percent of the people who answered the 2010 census with information from public and commercial data sets such as Facebook. But errors in this technique meant that only data for 52 million people would be completely correct, a little more than 1 in 6 of the U.S. population.

He said the 2010 census used the best possible privacy protection available, but hackers since then have become more skilled in reconstructing data.

To counter their growing abilities, the agency has completely changed the system for 2020 and will offer the “gold standard” of privacy regardless of the fate of the citizenship question, Abowd said.

People “want to know that statistical tables aren’t going to come back and haunt them,” Abowd said at the American Association for the Advancement of Science’s annual meeting.

“I promise the American people they will have the privacy that they deserve.”

Georgetown University provost Robert Groves, who headed the 2010 census, said the count had the proper privacy and that every census improves. He lauded the new steps.

Former agency chief Kenneth Prewitt, a professor of policy at Columbia University, said that exposure of basic information such as age and ethnicity, even if publicly revealed, isn’t as big a deal as other data breaches.

“There is a widespread privacy anxiety out there that is very much related to Facebook and Google and so forth,” Prewitt said. “I’m much more worried about the fact that my iPhone follows me around every day.”

In a statement, Apple’s Fred Sainz took issue with such privacy concerns: “The iPhone doesn’t follow you around all day long – Apple has no idea where you are nor do we care. And Apple does not sell information to companies.”

Abowd said, “The 2020 census will be the safest and best protected ever. And this is not as easy as it sounds.”

The new system involves complex mathematical algorithms that inject “noise” into the data, making it harder to get accurate information and providing “a very strong guarantee” of privacy, said Duke University computer science professor Ashwin Machanavajjhala.
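
For readers curious what injecting “noise” can look like in practice, here is a minimal sketch. It is not the Census Bureau’s method, which the article does not detail. It assumes one textbook approach, adding Laplace-distributed random noise to each published count. The function name `add_laplace_noise`, the `epsilon` privacy parameter and the sample counts are all hypothetical and chosen only for illustration.

```python
import random


def add_laplace_noise(true_counts, epsilon, rng):
    """Return each count plus Laplace noise with scale 1/epsilon.

    Illustrative assumption: each person changes a count by at most 1,
    so a smaller epsilon means more noise and stronger privacy.
    """
    scale = 1.0 / epsilon
    noisy = []
    for count in true_counts:
        # The difference of two independent exponential draws with
        # mean `scale` follows a Laplace distribution centered on zero.
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        noisy.append(count + noise)
    return noisy


# Hypothetical head counts for four small neighborhoods.
rng = random.Random(2010)
block_counts = [3, 0, 12, 7]
released = add_laplace_noise(block_counts, epsilon=0.5, rng=rng)
print(released)
```

The released figures stay close to the true counts on average, but any single figure is blurred, which is what makes it harder to work backward to one person’s answers.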
