Malta Independent

Incomplete responses to new hate terms


I was expecting new threats to appear online surrounding the Pittsburgh shooting, and there were signs that this was happening. In a recent anti-Semitic attack, Nation of Islam leader Louis Farrakhan used the word “termite” to describe Jewish people. I searched for this term, knowing racists were likely to use the new slur as a code word to avoid detection when expressing anti-Semitism.

Twitter had not suspended Farrakhan’s account in the wake of yet another of his anti-Semitic statements, and Twitter’s search function automatically suggested I might be searching for the phrase “termite eats bullets.” That turns Twitter’s search box into a hate-speech billboard.

The company had, however, apparently adjusted some of its internal algorithms, because no tweets with anti-Semitic uses of the word “termite” showed up in my search results.

Posts unnoticed for years

As I continued my searches for hate speech and calls for violence against Jewish people, I found even more disturbing evidence of shortfalls in Twitter’s content moderation system. In the wake of the 2016 U.S. election and the discovery that Twitter was being used to influence the election, the company said it was investing in machine learning to “detect and mitigate the effect on users of fake, coordinated, and automated account activity.” Based on what I found, these systems have not identified even very simple, clear and direct violent threats and hate speech that have been on its site for years.

When I reported a tweet posted in 2014 that advocated killing Jewish people “for fun,” Twitter took it down the same day – but its standard automated Twitter notice gave no explanation of why it had been left untouched for more than four years.

Hate games the system

When I reviewed hateful tweets that had not been caught after all those years, I noticed that many contained no text – the tweet was just an image. Without text, tweets are harder for users, and Twitter’s own hate-identifying algorithms, to find. But users who specifically look for hate speech on Twitter may then scroll through the activity of the accounts they find, viewing even more hateful messages.

Twitter seems to be aware of this problem: Users who report one tweet are prompted to review a few other tweets from the same account and submit them at the same time. This does end up subjecting some more content to review, but still leaves room for
