Content moderation online is a massive challenge. Social media platforms host billions of users posting content daily. Reviewing all that content to filter out harmful or illegal material is an endless task requiring armies of human moderators. The toll on those workers’ mental health from being exposed to the worst of humanity has been well-documented. Now, AI may provide a solution to relieve some of that burden on people.
1. The Rising Need for Content Moderation
As social platforms have grown, the amount of user-generated content has exploded. In 2021 alone:
- Twitter saw over 500 million tweets posted every day
- Facebook had 2.91 billion monthly active users
- YouTube users uploaded over 500 hours of video per minute
At those scales, human reviewers can’t evaluate everything. Yet platforms are under increasing pressure to curb harmful content like:
- Terrorist propaganda
- Child sexual abuse imagery
- Coordinated disinformation campaigns
- Cyberbullying and harassment
Trying to moderate at scale with humans alone simply isn’t viable. The result has been reliance on underpaid outsourced contractors working under punishing conditions.
2. The Trauma Content Moderators Face
Working as a social media content moderator means looking at the worst behaviour and impulses of humanity day in and day out. Exposure to a constant onslaught of disturbing, violent, hateful or otherwise traumatic content takes a severe toll on workers’ mental health.
Studies have found that moderators experience:
- Increased risk of PTSD symptoms
- Difficulty sleeping due to intrusive imagery and nightmares
- Anxiety, depression, and suicidal thoughts
Despite the harm, content moderators are often employed as contractors with minimal benefits or psychological support.
Platforms like Facebook have pledged to improve working conditions. But the core nature of the work makes it incredibly taxing on the human psyche.
3. Is AI-Assisted Moderation the Answer?
This is where artificial intelligence comes in. Automated content analysis tools powered by machine learning have the potential to significantly reduce human moderation workloads.
AI moderation offers several advantages:
- Scales massively: AI systems can analyze huge volumes of content tirelessly.
- Acts rapidly: Algorithms can flag prohibited material nearly instantaneously.
- Improves over time: Machine learning allows the systems’ accuracy to incrementally improve.
- Lightens human load: People would only need to review the hardest grey-area cases.
Of course, developing highly accurate AI moderation is extremely challenging. Contextual nuance around dangerous speech or imagery makes it difficult to automate.
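The division of labour described above, automated handling of clear-cut material with humans reserved for grey-area cases, can be sketched as a simple confidence-threshold triage. This is a minimal illustration, not any platform's actual pipeline: the classifier is a hypothetical stand-in, and the threshold values are invented for demonstration.

```python
from dataclasses import dataclass

# Illustrative thresholds: scores at or above REMOVE_THRESHOLD are
# auto-actioned, scores at or below APPROVE_THRESHOLD are published,
# and everything in between goes to a human reviewer. Real systems
# would tune these per policy category.
REMOVE_THRESHOLD = 0.95
APPROVE_THRESHOLD = 0.10

@dataclass
class Decision:
    action: str   # "remove", "approve", or "human_review"
    score: float

def triage(text: str, classify) -> Decision:
    """Route content based on a classifier's violation score (0.0-1.0).

    `classify` stands in for any model that returns the estimated
    probability that `text` violates policy.
    """
    score = classify(text)
    if score >= REMOVE_THRESHOLD:
        return Decision("remove", score)
    if score <= APPROVE_THRESHOLD:
        return Decision("approve", score)
    # Grey-area content: only these items reach a human moderator.
    return Decision("human_review", score)

# Toy stub classifier, for demonstration only.
def stub_classifier(text: str) -> float:
    return 0.99 if "attack" in text else 0.02

print(triage("have a nice day", stub_classifier).action)   # approve
print(triage("plan the attack", stub_classifier).action)   # remove
```

The key property is that human reviewers only ever see the middle band of uncertain cases, which is precisely where contextual judgment matters most.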
OpenAI, creators of ChatGPT, are working to advance large language models to help address content harms. Their research indicates AI can reach human accuracy for moderation tasks, while greatly enhancing scale and speed.
4. OpenAI’s Approach to AI Content Moderation
OpenAI sees AI moderation systems not as replacements for people, but as assistants to augment and support them. Their goal is to minimize exposure to the most harmful content.
Some key points about their approach:
- Focus on the Worst Content First: Priority is given to AI accurately flagging the most clearly policy-violating material like graphic violence or illegal imagery.
- Incremental Refinement: Models steadily improve through ongoing real-world use and tuning by engineers and expert moderators.
- Maintain Human Oversight: People remain in the loop to audit system outcomes, guide training, and handle judgment calls.
- Transparency: Documentation provided on model capabilities, limitations, and personal data handling.
Research from OpenAI indicates that AI can categorize text, images, and videos with human-level accuracy on certain hazards while offering massive scalability.
Of course, no automated system will ever be perfect. But even an accuracy rate of 95% would allow AI to pre-screen the vast majority of straightforward policy violations before a human ever sees them.
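To make that concrete, a back-of-envelope calculation (all numbers purely illustrative, not platform data) shows how much of the queue pre-screening could divert from human reviewers:

```python
# Purely illustrative numbers: a queue of 1,000,000 reported items,
# of which 80% are straightforward policy calls and 20% are grey-area.
total_items = 1_000_000
clear_cut = int(total_items * 0.80)
grey_area = total_items - clear_cut

# If AI confidently resolves 95% of the clear-cut items, humans see
# only the remaining 5% of those plus every grey-area item.
ai_resolved = int(clear_cut * 0.95)
human_queue = (clear_cut - ai_resolved) + grey_area

print(human_queue)                          # 240000
print(f"{human_queue / total_items:.0%}")   # 24%
```

Under these assumed proportions, the human workload drops to roughly a quarter of its original size, and what remains skews toward the ambiguous cases where human judgment adds the most value.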
5. The Promise and Challenges of AI Moderation
AI augmentation has the potential to help protect moderator mental health and improve platform safety, but it also carries risks that demand thoughtful design.
Potential benefits:
- Reduce human exposure to traumatic content
- Enable moderators to focus on complex cases
- Improve response time to urgent threats
- Scale consistently across a flood of content
- Apply policies consistently without human bias
Risks:
- AI inaccuracies leading to over-filtering
- Enablement of mass surveillance & predictive policing
- Entrenching bias present in training data
- Use for authoritarian censorship
- Loss of human discretion and empathy
As with any powerful technology, AI moderation brings opportunities as well as some perils. But if carefully implemented, it can help ease the psychological toll on human reviewers working to make online spaces healthier.
6. The Future of AI-Augmented Moderation
Major platforms have only begun dipping their toes into AI moderation assistance. But continued research and development could produce a sea change in how the internet is governed.
Here are some possible next steps:
- Expand accuracy benchmarks on diverse real-world test cases
- Improve transparency into model decisions and error rates
- Implement AI pre-screening focused on the most explicit content
- Maintain sizable teams of human auditors and trainers
- Consult with moderators, mental health experts & civil society on process reforms
- Consider enabling users to voluntarily opt into AI moderation where appropriate
With diligent engineering and ethical implementation, AI augmentation may relieve the toll of content moderation on workers. And a less burdened, better-supported human workforce will remain vital to ensure automated systems fulfil their promise safely.
AI is no silver bullet. But combined with people-centric reforms, automation offers hope for healthier information platforms. Getting there will require sustained research and honest reckoning with the inevitable hard tradeoffs around speech, safety and human thriving online.