Montreal Gazette

PROTECTED AND LEAKED

Google exposing youths’ identities

- ANDREW DUFFY

Google’s powerful search engine is defeating some court-ordered publication bans in Canada and undermining efforts to protect young offenders and victims.

Computer experts believe it’s an unintended, “mind-boggling” consequence of Google search algorithms.

In six high-profile cases documented by the Ottawa Citizen, searching the name of a young offender or victim online pointed to media coverage of their court cases, even though their names do not appear anywhere in the news articles themselves.

It’s a curious anomaly that appears to apply primarily to results produced by Google’s search engine. Similar searches in Bing and Yahoo! do not link the protected names to news coverage with the same consistency.

Informed of the findings, a Google spokesperson said the company would take action based on individual complaints. “If search results that violate local laws are brought to our attention, we’ll remove them,” Google Canada’s Aaron Brindle said in a written statement.

The problem was discovered by Citizen court reporter Gary Dimmock, who found that a Google search of a young offender’s name linked to news articles about the youth’s case.

In that case, an Ottawa youth was found guilty of making a series of fake bomb threats and 911 calls that triggered police SWAT team responses to schools, shopping malls and personal residences across North America. He was 16 at the time and, as a result, his identity was barred from publication by provisions of the Youth Criminal Justice Act.

Dimmock asked editors to investigate the situation and ensure that the newspaper hadn’t accidentally encoded the youth’s name onto its website. That subsequent investigation found the problem did not lie with the newspaper or its online source code.

There was also metadata to consider. Metadata includes various information fields that reporters or editors fill out to describe an article, its photos or its videos, but which is not visible to the public. Again, nowhere in the metadata did the name appear.

Further inquiries revealed a similar issue existed in at least five other high-profile Ottawa cases in which publication bans had been in effect, as well as in court cases in Montreal and Windsor.

Media outlets are privy to the names of people whose identities are protected by publication bans by virtue of having a reporter in the courtroom.

A Google search for the name of an Ottawa-based RCMP officer convicted of confining, starving and abusing his son links to coverage of that court case. The officer’s identity is protected by a judge’s order designed to shield his son from publicity.

The officer’s name had never been reported by the Citizen or any other media outlet. The abused boy, now 15, was never identified in any article published online. Yet a search of the boy’s name produces results that link to coverage of the case.

In another high-profile case, a 17-year-old was charged in November 2016 with a series of offences after local mosques, synagogues and a church were spray-painted with hateful graffiti. The young offender has never been named in online news articles, but a search of his name connects to them.

In yet another case, searching the name of a 1970s sex-assault victim revealed court coverage of the trial of a Kingston teacher and Ontario Hockey League billet found to have abused him.

In Windsor, former OHL player Ben Johnson was convicted last year of sexually assaulting a “near comatose” 16-year-old girl at a nightclub. The victim’s name, protected by a publication ban, is now tied to online court coverage of the case.

The Google feedback loop means that in select cases, court-ordered publication bans are being undermined.

Most online news articles are archived on the web without an expiry date, which means the pieces will exist there for years to come.

Potential employers, coworkers, friends, partners — anyone searching an individual’s name — can therefore be linked to online coverage that exposes the individual’s past. The situation allows what the court-ordered bans were designed to protect against.

Ottawa lawyer Michael Crystal said he believes Google’s search results may violate Canadian law and open the company to a class-action lawsuit from those whose privacy has been violated.

“It’s very frightening that so many people could have access to this information and it’s virtually unprotected,” said Crystal, a privacy and data breach specialist.

“In the battlefield of privacy law, it seems like we’re always fighting a rearguard action in terms of protecting privacy.”

Google’s Aaron Brindle declined to answer questions about how the search engine can link a given name to court coverage that doesn’t contain it. “Hundreds of factors contribute to what search results appear for a given query,” he said, “including things like PageRank, the specific words that appear on websites, the freshness of content and your region.”

Google co-founder Larry Page once described the perfect search engine as “understanding exactly what you mean and giving you back exactly what you want.”

To that end, Google uses programming algorithms to instantly sort through trillions of web pages to produce the most relevant results. It handles more than 3.5 billion searches a day, and dominates the search engine landscape. The company is constantly refining its algorithms so that its searches do not simply return pages with the most matching “keywords,” but rather those with the most relevant information. Last year alone, Google made 1,600 tweaks to its search engine algorithm.

A former company executive has revealed Google search results are sometimes informed by common search patterns.

In May 2012, Amit Singhal, then a Google senior vice-president, wrote a blog post to unveil major changes to the search engine that tapped into the “collective intelligence of the web.”

“We can now sometimes help answer your next question before you’ve asked it because the facts we show are informed by what other people have searched for,” Singhal wrote.

The inner workings of Google’s search formula are a closely guarded secret. But based on publicly available information, it’s possible to make educated guesses about how publication bans are being thwarted.

In some cases, including the case of the young Ottawa swatter, a protected name was used in social media or in a blog by private individuals, not media outlets, that described the incident in question and linked to coverage. Evidence suggests Google’s algorithm learns to associate web pages with search queries when a link to the page frequently appears alongside a particular phrase.

Some people have used the algorithm to make mischief. For example, in a coordinated effort launched by a political blogger in 2003, the search terms “miserable failure” linked to pages about then-U.S. president George W. Bush. The results-altering practice became known as “Google bombing.”
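The link-and-phrase association described above can be illustrated with a toy sketch. It is not Google’s actual algorithm, which is not public; it only assumes that a search index counts how often a phrase is used as the text of a link to a page. The function names, the `.example` addresses and the sample links are all hypothetical.

```python
from collections import Counter, defaultdict


def build_anchor_index(links):
    """Count how often each phrase is used as the text of a link to each page.

    `links` is an iterable of (source_url, phrase, target_url) tuples.
    This is an illustrative assumption, not a description of any real engine.
    """
    index = defaultdict(Counter)
    for _source, phrase, target in links:
        index[phrase.lower().strip()][target] += 1
    return index


def pages_for_phrase(phrase, index):
    """Return pages linked with this phrase, most frequently linked first."""
    counts = index.get(phrase.lower().strip(), Counter())
    return [url for url, _ in counts.most_common()]


# Hypothetical data: three blogs link the same phrase to one page.
links = [
    ("blog-a.example", "Miserable failure", "bio.example/president"),
    ("blog-b.example", "miserable failure", "bio.example/president"),
    ("blog-c.example", "miserable failure ", "bio.example/president"),
    ("blog-d.example", "miserable failure", "other.example/page"),
]
anchor_index = build_anchor_index(links)
top_pages = pages_for_phrase("miserable failure", anchor_index)
print(top_pages)
```

In this sketch the target page ranks first for the phrase even though the phrase never has to appear on the page itself; the association comes entirely from what other sites wrote when linking to it.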

Other publication ban cases are harder to explain, particularly those in which a protected name has never appeared anywhere on the web in connection to a crime. Such is the case for the unnamed RCMP officer found guilty of abusing his son.

Aristides Gionis, a computer science professor at Finland’s Aalto University, suggested the search results may be influenced by common search patterns.

For example, people who know or suspect they know the identity of someone whose name is covered by a publication ban might search for terms such as “John Doe RCMP Ottawa child abuse” and click on links to articles about the case. If enough people do that, the search engine might learn to associate those articles with searches for “John Doe,” even though the name does not appear in the article.
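The theory Gionis describes can be made concrete with a small sketch. It is a simplified guess at the mechanism, not Google’s actual system: it assumes the engine keeps a log of (query, clicked page) pairs and starts associating a query word with a page once enough searchers have clicked it. The function names, the threshold `min_clicks`, the `.example` address and the sample log are hypothetical, and “John Doe” is the same placeholder name used above.

```python
from collections import Counter, defaultdict


def learn_associations(click_log, min_clicks=3):
    """Link each query word to pages clicked at least `min_clicks` times.

    `click_log` is an iterable of (query, clicked_url) pairs. The threshold
    is an arbitrary assumption chosen for illustration.
    """
    clicks_per_term = defaultdict(Counter)
    for query, clicked_url in click_log:
        for term in set(query.lower().split()):
            clicks_per_term[term][clicked_url] += 1
    return {
        term: [url for url, n in pages.most_common() if n >= min_clicks]
        for term, pages in clicks_per_term.items()
    }


def search(term, page_text, associations):
    """Return pages containing the term, then pages only associated by clicks."""
    term = term.lower()
    text_hits = [url for url, text in page_text.items() if term in text.lower()]
    learned = [u for u in associations.get(term, []) if u not in text_hits]
    return text_hits + learned


# Hypothetical article that never contains the protected name.
ARTICLE = "news.example/abuse-trial"
page_text = {ARTICLE: "An Ottawa RCMP officer was convicted of abusing his son."}

# Three searchers who know the name include it in a query and click the story.
click_log = [("John Doe RCMP Ottawa child abuse", ARTICLE)] * 3

associations = learn_associations(click_log)
results = search("Doe", page_text, associations)
print(results)
```

Here a search for the name alone returns the article, although the name appears nowhere in its text. A real engine would weigh far more signals, but the sketch shows how click patterns alone could create the link.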

“Search engines are smart enough to learn some hidden associations … but not smart enough to know that the associations should not be used in certain cases,” explained Gionis, an expert in data mining and algorithmic data analysis.

University of Toronto computer science professor Periklis Andritsos, a data-mining expert, said he could not isolate a technical explanation for the phenomenon. “It’s mind-boggling,” he said. “I would like to be able to say 100 per cent this is it, but I can’t.”

The best theory Andritsos could offer is that a critical number of people have used the protected name alongside similar search terms, thereby establishing a pattern of links to news coverage.

“If there are a lot of links pointing to these sites, it could be the case,” he said. “It must be accidental, but I can’t find explanations for it.”

Montreal lawyer Allen Mendelsohn said the case involves an untested area of the law since no one employed by Google would know the protected names.

“It must be Google’s technology figuring things out on its own, scraping information in an artificially intelligent way,” said Mendelsohn, an internet law specialist who teaches privacy law at McGill University.

“This could be a violation of a publication ban, but whether it is or not is a bit more complicated because Google is not seen as a publisher, per se, and that adds complexity to the situation.”

JIM WELLS / POSTMEDIA FILES An Ottawa lawyer believes Google’s search results may open the company to a class-action lawsuit from those whose privacy has been violated.
