In this month’s column, we discuss the problem of content moderation and compliance checking using NLP techniques.
We have been discussing the machine reading comprehension task over the last couple of months. This month, we take a break from that discussion and focus on a real-life problem that NLP (natural language processing) can help solve. Let us start with a question to our readers. We all know what the coolest job in information technology is these days. As Hal Varian, Google’s chief economist, remarked a few years back, it is the job of the data scientist (https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century). But do you know what the worst technology job is? Well, it is that of the content moderators on social media sites such as Facebook or YouTube. Their job is to constantly sift through the user generated content (UGC) posted on these websites and filter out content that is abusive.
Content moderation requires analysing a wide variety of user generated content – blogs, emails on community forums, news articles posted on social media sites, tweets, videos, photos, and even online games. Content moderators need to identify unsuitable or abusive content and ensure that it is taken down quickly.
Content moderation gained a lot of public attention last year, when a user posted a live video of a killing on Facebook. Sites such as YouTube and Facebook employ large numbers of human content moderators whose job is to ensure that abusive or illegal content is blocked from public viewing. This includes filtering out pornographic material, violent visuals or language, exploitative images of minors, the soliciting of sexual favours, racist comments, etc., from the text, video or audio tracks posted on the Internet. However, performing this task leads to enormous stress and burnout among the human content moderators. There have even been reports of post-traumatic stress disorder (PTSD) being prevalent among people working in this space. In addition, as the volume of UGC on the Internet grows exponentially, human moderation cannot scale and often becomes error prone.
There are two basic types of content moderation – reactive and proactive. In reactive content moderation, the filtering happens offline: content is posted first, and moderators then scan it and decide whether it is acceptable or not. In proactive content moderation, content is analysed for anything objectionable in real time, as soon as it is submitted and before it gets posted.
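The difference between the two modes can be sketched in a few lines of Python. This is a toy illustration, not any site's actual pipeline: the `is_objectionable()` classifier, the in-memory stores and all the function names are hypothetical stand-ins.

```python
# A minimal sketch of reactive vs proactive moderation.
# is_objectionable() is a hypothetical classifier; here it is a
# trivial keyword check purely for illustration.

def is_objectionable(text):
    return "spamword" in text.lower()

published = []      # content visible to the public
review_queue = []   # content awaiting offline human review

def post_proactive(text):
    """Proactive: screen content in real time, before publishing."""
    if is_objectionable(text):
        return False          # rejected; never reaches public view
    published.append(text)
    return True

def post_reactive(text):
    """Reactive: publish immediately, review after the fact."""
    published.append(text)
    review_queue.append(text)  # moderators scan this queue offline

def review_pass():
    """Offline sweep: take down anything flagged later."""
    for text in list(review_queue):
        if is_objectionable(text):
            published.remove(text)
        review_queue.remove(text)
```

Note the trade-off the sketch makes visible: the proactive path adds latency to every post but nothing objectionable ever goes public, while the reactive path publishes instantly but leaves a window during which objectionable content is visible.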
Given the need for real-time filtering of objectionable content on large social media sites, human moderation alone cannot prevent such content from reaching public view in time. Due to these issues with human content moderation, there has been a trend towards automated approaches to online content moderation. Large Internet companies such as Facebook and YouTube have invested heavily in developing machine learning/AI based tools for automatic content moderation. While content moderation is applicable to multiple media such as video, text and speech, in this column we focus on the problem of content moderation for text.
Whatever the form of the content, we first need to understand what makes this problem challenging.
Let us first consider text. The obvious approach is to create a lexicon of words associated with abusive, hateful and objectionable text. Given this lexicon, it is straightforward to flag objectionable content. Yet, why doesn’t this approach work? There are a number of reasons. First and foremost, people who create and post objectionable content are always looking for ways to circumvent the moderation scheme. For instance, if your objectionable content lexicon contains the word ‘bullshit’, the submitter can write ‘bulls**t’ to fool the lexicon. While such simple character substitutions can be caught with regular expressions and a little intelligent processing, people also circumvent moderation by using an innocuous-sounding word instead of the objectionable one (for instance, using ‘grape’ in place of ‘rape’). Hence simple lexicon based systems are easily circumvented by intelligent workarounds.
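The point above can be made concrete with a small sketch. This is a toy example with a two-word lexicon (reusing the column's own examples): a naive lookup misses the masked word ‘bulls**t’, a regex-based normalisation catches it, but the word substitution ‘grape’ slips past both, since no amount of pattern matching on characters recovers the intended meaning.

```python
import re

# Toy lexicon, using the examples from the text above.
LEXICON = {"bullshit", "rape"}

def naive_flag(text):
    """Flag text only if a lexicon word appears verbatim."""
    tokens = re.findall(r"[a-z*]+", text.lower())
    return any(t in LEXICON for t in tokens)

def normalised_flag(text):
    """Also catch simple masking such as 'bulls**t' by treating
    each '*' in a token as a single-character wildcard and matching
    the token against every lexicon entry."""
    for token in re.findall(r"[a-z*]+", text.lower()):
        pattern = re.escape(token).replace(r"\*", ".")
        if any(re.fullmatch(pattern, word) for word in LEXICON):
            return True
    return False

naive_flag("that is bulls**t")       # masked word fools the naive check
normalised_flag("that is bulls**t")  # wildcard matching catches it
normalised_flag("talk about grape")  # word substitution evades both
```

The last call is the instructive one: ‘grape’ is a perfectly ordinary word, so no lexicon or character-level normalisation can flag it without also flagging innocent text, which is exactly why lexicon based systems break down.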