The sheer amount of digital content created by users can be overwhelming. Moderating this content through human review alone is not only time-consuming and inefficient; it can also endanger the mental health of the moderators. One of the primary challenges of Trust and Safety is finding an efficient, accurate way to handle the volume of content moderation required to protect user safety.
There are a number of different ways that user-generated content can violate community guidelines. Hate speech, cyberbullying, radicalization, illegal solicitation, violent or explicit content – each of these is subject to prohibition by platforms. However, each behavior has different targets, perpetrators, and methods, requiring adaptive solutions for different situations.
The tactics that users employ to engage in inappropriate activities are constantly changing – in part, to evade simplistic automated solutions such as keyword or profanity filters. Trust and Safety teams must devise processes and solutions that answer a platform’s current needs while keeping abreast of evolving evasion methods, such as l33t speak (substituting numbers and symbols for letters to slip past filters).
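To illustrate why simple keyword filters fall short, here is a minimal sketch (the blocked word list and substitution map are hypothetical placeholders, not any real platform's rules): a naive filter misses a l33t-speak spelling, while normalizing common character substitutions before matching catches it.

```python
import re

# Hypothetical placeholder list for the sketch, not a real moderation wordlist.
BLOCKED_WORDS = {"spam"}

# Map common l33t-speak substitutions back to letters before matching.
LEET_MAP = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}
)

def naive_filter(text: str) -> bool:
    """Flags text only when a blocked word appears spelled verbatim."""
    words = re.findall(r"\w+", text.lower())
    return any(w in BLOCKED_WORDS for w in words)

def normalized_filter(text: str) -> bool:
    """Undoes simple l33t substitutions first, then applies the same check."""
    normalized = text.lower().translate(LEET_MAP)
    words = re.findall(r"\w+", normalized)
    return any(w in BLOCKED_WORDS for w in words)

print(naive_filter("buy sp4m here"))       # False: the l33t spelling evades the filter
print(normalized_filter("buy sp4m here"))  # True: normalization catches it
```

Even the normalized version is easily defeated (spacing, homoglyphs, coded language), which is why static filters alone cannot keep up with evolving tactics.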
Online platforms continue to develop new ways for users to communicate with one another. A social platform that launched with an exchange of comments may later add the ability to post photos. During social distancing, many dating apps incorporated video chat as a way to bring people together while physically apart.
However, Trust and Safety processes that work on one channel may not work on another. This is where interdepartmental commitment to promoting Trust and Safety is critical. Before a new channel is launched, it should be designed, developed, and tested to ensure a safe and inclusive environment for all users.
As with opening a platform to new channels, supporting new languages should be a thoughtful, measured, and tested initiative. At the very least, community guidelines should be translated into a language before the platform supports it; failing to do so can allow inappropriate or abusive behavior to go unchecked.
For example, Facebook ‘officially’ supports 111 languages with menus and prompts, and Reuters found an additional 31 languages commonly used on the platform. However, the Facebook community guidelines were translated into only 41 languages, meaning that users speaking 60 to 90 other languages were never told what counts as inappropriate content on Facebook.
Governments worldwide are demanding that online platforms actively moderate the content their users share and remove what violates the rules. The Australian Online Safety Bill 2021 seeks to tame cyber abuse, cyberbullying, and unwanted sharing of intimate content. It also gives the eSafety Commissioner more power to compel platforms to remove toxic content and provide details of users who post offensive content. The UK is working to strengthen its Online Safety Bill to prevent the spread of illegal content and activity such as images of child abuse, terrorist material, and hate crimes. The EU's GDPR and, in the US, the CCPA and CPRA create new classes of “sensitive data” and offer users more control over it.
Finally, one of the trickiest aspects of building a safe, inclusive environment for users is managing behavioral nuance. Nuanced behavior can be difficult to identify and respond to without a person reviewing the content, yet human review is an extraordinarily resource-intensive, inefficient solution.
Luckily, technological advances are being applied to the Trust and Safety challenges facing online platforms. Artificial intelligence (AI) can help automate the identification of, and initial response to, inappropriate user behavior across platforms with different thresholds for what constitutes acceptable behavior. For example, Spectrum Labs offers an AI-based solution that moderates content in context, reading the nuance of different situations and reducing the need for human moderation by 50%.