The Daily Telegraph

Facebook has the last word on profanity

Social media site compiled world’s largest list of obscenities in almost every language from hate posts

- By Harry de Quetteville

A foul-mouthed compendium of swear words in almost every language has been compiled by Facebook. The result of the social media giant trawling through mountains of hate posts, the list is believed to be the largest ever put together. Moderators use the document to police the torrent of new content the network publishes each day. A source said it would never be published “to avoid people gaming the system by misspelling the word in a way that it is still recognisable”.

FACEBOOK has amassed a list of swear words believed to be the world’s largest, after trawling billions of posts for hate speech left it with a compendium of profanities in almost every language.

The social media network – with more than 2.3 billion users – has long struggled to moderate the torrent of new content it publishes every day. It now deploys a combination of artificial intelligence and 15,000 human reviewers to block anything “that describes or negatively targets people with slurs”.

In doing so, however, it has generated an immense list of foul language which its human “reviewers” can refer to in order to enforce what Facebook calls its “community standards”.

The list’s existence was confirmed to The Telegraph by a Facebook source, who added that it would never be made public “to avoid people gaming the system by misspelling the word in a way that it is still recognisable …”

Earlier this year, Mike Schroepfer, the Facebook chief technology officer, told a conference that the network was engaged in “an intensely adversarial game” when it came to moderating content. “We build a new technique, we deploy it, people work hard to try to figure out ways around this.”

Memes, images and slang have presented challenges to its attempts to regulate posts, and led to tortuous definitions about what does and does not constitute “hate speech”. Currently it is defined as a “direct attack on people based on ‘protected characteristics’ – race, ethnicity, national origin, religious affiliation, sexual orientation, caste, sex, gender, gender identity, and serious disease or disability.”

The site bans reference or comparison to “insects and animals that are culturally perceived as intellectually or physically inferior”.

But it accepts that the concept of offensive material is slippery – for example, in the way some slurs are adopted by their subjects and “used self-referentially or in an empowering way”.

It is not just words. The site has developed a large-scale machine learning system named Rosetta to scan text embedded in pictures posted by users.

This works by deploying sophisticated analysis of the picture alongside its text, to detect whether the “meaning” conveyed by the two in combination is offensive or not. Rosetta is currently thought to be evaluating more than 1 billion images each day.

But Facebook concedes that scanning text in videos presents significant challenges. The site said that in the first quarter of 2019 it took action on 4 million pieces of content, up from 2.5 million in the same period of 2018.

Of those, it intervened before users reported any offence in 65 per cent of cases, up from 38 per cent in the same period of 2018. “Violating material” that is not found before users spot it is passed to 15,000 or so moderators – native language speakers who, the source said, “collectively speak almost every language widely used”, and work in more than 20 locations, including America, Ireland, Germany, the Philippines and Spain.
