The Japan News by The Yomiuri Shimbun

Head shots of victims included in data for generative AI training

- By Takushi Kuwahara Yomiuri Shimbun Staff Writer

Many head shots of victims of incidents and disasters have been included in data used to train image-generating artificial intelligence, The Yomiuri Shimbun has found.

Such images are believed to have been collected from news websites and other online sources and used without permission. As the possibility of AI tools generating images resembling those of the victims cannot be ruled out, the practice is likely to spark debate about its pros and cons.

“I can't believe my daughter's photo has been used for such things,” said Yukimi Takahashi, 61, whose daughter, Matsuri, then a 24-year-old new employee of advertising giant Dentsu Inc., committed suicide due to overwork in 2015.

After Matsuri's death, Takahashi provided a face photo of her daughter to news organizations while also posting it on social media. She did so to raise public awareness of the reality of overwork, hoping to prevent similar incidents from happening again.

However, the photo was included in a dataset used by Stable Diffusion, one of the most popular AI models in the world.

The inclusion of such photos came to light when The Yomiuri Shimbun examined the publicly released dataset last December.

“I want [such photos] not to be used for irrelevant AI training,” Takahashi said.

Quake victims also included

Generative AI creates elaborate images that are indistinguishable from illustrations or photographs simply by being given instructions. Vast amounts of data are required for AI training to improve the precision of the images.

According to Stability AI Ltd., the British startup that developed Stable Diffusion, the AI model uses data provided by German nonprofit organization LAION.

The dataset contains about 5.8 billion images. Besides the photo of Matsuri, many other images of victims of incidents and accidents have been found. The data also contains photos of the children who were victimized by a serial killer in Kobe in 1997 as well as those of four victims of the so-called Setagaya family murder incident in Tokyo in 2000. There were also images of victims of disasters, such as the 2011 Great East Japan Earthquake, and the Sept. 11, 2001, terrorist attack in the United States.

The images in the dataset were collected by a program that automatically crawls the internet, gathering data from news sites and from online bulletin boards to which victims' images had been copied from news sites and elsewhere.

Response is ‘insufficient'

There have been cases in which news reports carry head shots of victims in order to convey the reality of incidents and disasters.

Datasets used to train image-generating AI contain images collected mechanically and indiscriminately, regardless of their content. Given that, unauthorized AI training on illustrations and other copyrighted materials has also been considered problematic.

The Yomiuri Shimbun previously found sexual images of real children in the dataset. Such images could amount to violations of the law banning child prostitution and child pornography.

According to experts, AI can generate images similar to those it has learned from. It is therefore possible that generated images could defame the victims, or that they could be misused to disseminate false information.

In an email interview, Stability AI said there is a mechanism to exclude certain data from AI training upon request. However, the company did not answer whether it was aware that the dataset contained head shots of victims.

“From the point of view of the bereaved families who have made public the victims' head shots to stress the lessons from the incidents and disasters, it is unexpected that such images have been used for AI learning, and this involves the dignity of the deceased,” said Akiko Orita, a professor of information sociology at Kanto Gakuin University who specializes in digital data of the dead.

“It is different from news reporting, which is in the public interest. So it's not sufficient for AI developers to respond to the matter simply by saying, ‘If there is a request, we will exclude the image,'” she added.

On the other hand, apart from the issue of unauthorized AI training, demand for AI re-creations of deceased family members may grow in the future.

“With the use of AI spreading, it is necessary for society as a whole to discuss how to protect the feelings of bereaved families and respect the dignity of the dead,” Orita said.

