The News Herald (Willoughby, OH)

AI could help solve privacy issue

- Zhiyuan Chen and Aryya Gangopadhyay, University of Maryland, Baltimore County. The Conversation is an independent and nonprofit source of news, analysis and commentary from academic experts.

The stunning successes of artificial intelligence would not have happened without the availability of massive amounts of data, whether it's smart speakers in the home or personalized book recommendations. And the spread of AI into new areas of the economy, such as AI-driven marketing and self-driving vehicles, has been driving the collection of ever more data.

These large databases are amassing a wide variety of information, some of it sensitive and personally identifiable. All that data in one place makes such databases tempting targets, ratcheting up the risk of privacy breaches.

The general public is largely wary of AI’s data-hungry ways. According to a survey by Brookings, 49% of people think AI will reduce privacy. Only 12% think it will have no effect, and a mere 5% think it may make it better.

As cybersecurity and privacy researchers, we believe that the relationship between AI and data privacy is more nuanced. The spread of AI raises a number of privacy concerns, most of which people may not even be aware of.

But in a twist, AI can also help mitigate many of these privacy problems.

Privacy risks from AI stem not just from the mass collection of personal data, but from the deep neural network models that power most of today’s artificial intelligence. Data isn’t vulnerable just from database breaches, but from “leaks” in the models that reveal the data on which they were trained.

Deep neural networks – which are a collection of algorithms designed to spot patterns in data – consist of many layers. In those layers are a large number of nodes called neurons, and neurons from adjacent layers are interconnected.

Each node, as well as each link between nodes, encodes certain bits of information.

These bits of information are created when a special process scans large amounts of data to train the model.
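For readers who like to see the nuts and bolts, here is a minimal Python sketch of that layered structure. The layer sizes, weights and input are invented purely for illustration, not taken from any real system.

```python
import numpy as np

# A tiny feed-forward network: several layers of "neurons", each fully
# connected to the next. The numbers in the weight matrices are the
# bits of information that training adjusts.
rng = np.random.default_rng(0)

layer_sizes = [4, 8, 8, 1]          # input, two hidden layers, output
weights = [rng.normal(size=(m, n))  # links between adjacent layers
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x):
    """Pass an input through every layer, applying a simple activation."""
    for w in weights:
        x = np.maximum(0, x @ w)    # keep only the positive signals
    return x

# One made-up input with four features, e.g. simple statistics of an image.
print(forward(rng.normal(size=(1, 4))))
```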

For example, a facial recognition algorithm may be trained on a series of selfies so it can more accurately predict a person’s gender. Such models are very accurate, but they also may store too much information – actually remembering certain faces from the training data. In fact, that’s exactly what researchers at Cornell University discovered. Attackers could identify people in training data by probing the deep neural networks that classified the gender of facial images.

On the other hand, AI can be used to mitigate many privacy problems. According to Verizon’s 2019 Data Breach Investigations Report, about 52% of data breaches involve hacking. Most existing techniques to detect cyberattacks rely on patterns. By studying previous attacks and identifying how the attacker’s behavior deviates from the norm, these techniques can flag suspicious activity. It’s the sort of thing at which AI excels: studying existing information to recognize similar patterns in new data.
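A rough sketch of that pattern-based approach, using the open-source scikit-learn library, might look like the following. The "login" features and numbers here are entirely made up for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)

# Invented features describing normal logins: [hour of day, MB transferred].
normal_logins = np.column_stack([
    rng.normal(13, 2, size=500),     # mostly business hours
    rng.normal(20, 5, size=500),     # modest data transfer
])

# Learn what "normal" looks like from past activity.
detector = IsolationForest(random_state=0).fit(normal_logins)

# A suspicious event: a 3 a.m. login moving 500 MB of data.
suspicious = np.array([[3.0, 500.0]])
print(detector.predict(suspicious))  # -1 flags the event as anomalous
```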

Still, AI is no panacea. Attackers can often modify their behavior to evade detection.

A recent branch of AI research called adversarial learning seeks to improve AI technologies so they’re less susceptible to such evasion attacks. For example, we have done some initial research on how to make it harder for malware, which could be used to violate a person’s privacy, to evade detection. One method we came up with was to add uncertainty to the AI models so the attackers cannot accurately predict what the model will do. Will it scan for a certain data sequence? Or will it run the code in a sandbox? Ideally, a malicious piece of software won’t know and will unwittingly expose its motives.
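The flavor of that idea can be shown with a toy Python sketch: the detector picks its checks at random on each run, so a piece of malware cannot tailor its behavior to a single, predictable test. The two checks below are placeholders invented for illustration, not our actual research code.

```python
import random

def looks_like_exfiltration(sample: bytes) -> bool:
    """Placeholder check: scan for a suspicious data sequence."""
    return b"BEGIN_CARD_NUMBERS" in sample

def misbehaves_in_sandbox(sample: bytes) -> bool:
    """Placeholder check: pretend to observe the sample in a sandbox."""
    return sample.endswith(b"delete_all")

CHECKS = [looks_like_exfiltration, misbehaves_in_sandbox]

def detect(sample: bytes) -> bool:
    # Randomly choose which checks to run, and how many, so an attacker
    # cannot predict what the detector will do on any given run.
    chosen = random.sample(CHECKS, k=random.randint(1, len(CHECKS)))
    return any(check(sample) for check in chosen)

# Whether this sample is flagged depends on which checks were drawn.
print(detect(b"...BEGIN_CARD_NUMBERS..."))
```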

Another way we can use AI to improve privacy is by probing the vulnerabilities of deep neural networks.

No algorithm is perfect, and these models are vulnerable because they are often very sensitive to small changes in the data they are reading.

For example, researchers have shown that a Post-it note added to a stop sign can trick an AI model into thinking it is seeing a speed limit sign instead. Subtle alterations like that take advantage of the way models are trained to reduce error. Those error-reduction techniques open a vulnerability that allows attackers to find the smallest changes that will fool the model.
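Here is a stripped-down Python sketch of that idea, using a toy linear classifier rather than a real stop-sign model: the same gradient information that training uses to reduce error also tells an attacker which tiny nudge to each input value will most change the output. All of the numbers are invented.

```python
import numpy as np

# A toy "trained" classifier: score = w . x + b, class 1 if the score > 0.
w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.2, 0.3, 0.4])         # an input the model labels as class 0

def predict(v):
    return int(w @ v + b > 0)

# For this simple model the gradient of the score with respect to the input
# is just w, so the most damaging small change nudges each feature in the
# direction of w's sign -- the trick behind gradient-based evasion attacks.
epsilon = 0.1                          # keep the change tiny
x_adv = x + epsilon * np.sign(w)

print(predict(x), predict(x_adv))      # prints 0 then 1: the nudge flips the label
```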

These vulnerabilities can be used to improve privacy by adding noise to personal data. For example, researchers from the Max Planck Institute for Informatics in Germany have designed clever ways to alter Flickr images to foil facial recognition software. The alterations are incredibly subtle, so much so that they’re undetectable by the human eye.

The third way that AI can help mitigate privacy issues is by preserving data privacy when the models are being built. One promising development is called federated learning, which Google uses in its Gboard smart keyboard to predict which word to type next. Federated learning builds a final deep neural network from data stored on many different devices, such as cellphones, rather than one central data repository.

The key benefit of federated learning is that the original data never leaves the local devices, so privacy is protected to some degree. It’s not a perfect solution, though, because the local devices perform only part of the computation; the intermediate results they send back could still reveal some data about the device and its user.
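A bare-bones sketch of the federated idea, in Python: each simulated "phone" summarizes its own data locally and only shares those small summaries, never the raw data, and the central step simply averages them. The data and the model are invented for illustration; real systems such as Gboard's are far more elaborate.

```python
import numpy as np

rng = np.random.default_rng(2)

def local_update(local_data):
    """Each device computes a tiny model (here, just a mean and variance)
    from data that never leaves the device."""
    return np.array([local_data.mean(), local_data.var()])

# Three simulated phones, each holding its own private measurements.
phones = [rng.normal(loc=5 + i, scale=1.0, size=100) for i in range(3)]

# Only the small model updates travel to the server...
updates = [local_update(data) for data in phones]

# ...where they are averaged into a shared model. Note that even these
# intermediate results can still hint at what each user's data looked like.
global_model = np.mean(updates, axis=0)
print(global_model)
```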

Federated learning offers a glimpse of a future where AI is more respectful of privacy.

We are hopeful that continued research into AI will find more ways it can be part of the solution rather than a source of problems.
