Social media platforms regularly engage in “content moderation”—the depublication, downranking, and sometimes outright censorship of information and/or user accounts from social media and other digital platforms, usually based on an alleged violation of a platform’s “community standards” policy. In recent years, this practice has become a matter of intense public interest. Not coincidentally, thanks to growing pressure from governments and some segments of the public to restrict various types of speech, it has also become more pervasive and aggressive, as companies struggle to self-regulate in the hope of avoiding legal mandates.

Many of us view content moderation as a given, an integral component of modern social media. But the specific contours of the system were hardly foregone conclusions. In the early days of social media, decisions about what to allow and what not to were often made by small teams or even individuals, and often on the fly. And those decisions continue to shape our social media experience today.

Roz Bowden—who spoke about her experience at UCLA’s All Things in Moderation conference in 2017—ran the graveyard shift at MySpace from 2005 to 2008, training content moderators and devising rules as they went along. Last year, Bowden told the BBC:

We had to come up with the rules. Watching porn and asking whether wearing a tiny spaghetti-strap bikini was nudity? Asking how much sex is too much sex for MySpace? Making up the rules as we went along. Should we allow someone to cut someone's head off in a video? No, but what if it is a cartoon? Is it OK for Tom and Jerry to do it?

Similarly, in the early days of Google, then-deputy general counsel Nicole Wong was internally known as “The Decider” as a result of the tough calls she and her team had to make about controversial speech and other expression. In a 2008 New York Times profile of Wong and Google’s policy team, Jeffrey Rosen wrote that as a result of Google’s market share and moderation model, “Wong and her colleagues arguably have more influence over the contours of online expression than anyone else on the planet.”

Built piecemeal over the years by a number of different actors passing through Silicon Valley’s revolving doors, content moderation was never meant to operate at the scale of billions of users. The engineers who designed the platforms we use on a daily basis failed to imagine that one day they would be used by activists to spread word of an uprising...or by state actors to call for genocide. And as pressure from lawmakers and the public to restrict various types of speech—from terrorism to fake news—grows, companies are desperately looking for ways to moderate content at scale.

They won’t succeed—at least if they care about protecting online expression even half as much as they care about their bottom line.

The Content Moderation System Is Fundamentally Broken. Let Us Count the Ways:

1. Content Moderation Is a Dangerous Job—But We Can’t Look to Robots to Do It Instead

As a practice, content moderation relies on people in far-flung (and almost always economically less well-off) locales to cleanse our online spaces of the worst that humanity has to offer so that we don’t have to see it. Most major platforms outsourcing the work to companies abroad, where some workers are reportedly paid as little as $6 a day and others report traumatic working conditions. Over the past few years, researchers such as EFF Pioneer Award winner Sarah T. Roberts have exposed just how harmful a job it can be to workers.

Companies have also tried replacing human moderators with AI, thereby solving at least one problem (the psychological impact that comes from viewing gory images all day), but potentially replacing it with another: an even more secretive process in which false positives may never see the light of day.

2. Content Moderation Is Inconsistent and Confusing

For starters, let’s talk about resources. Companies like Facebook and YouTube expend significant resources on content moderation, employing thousands of workers and utilizing sophisticated automation tools to flag or remove undesirable content. But one thing is abundantly clear: The resources allocated to content moderation aren’t distributed evenly. Policing copyright is a top priority, and because automation can detect nipples better than it can recognize hate speech, users often complain that more attention is given to policing women’s bodies than to speech that might actually be harmful.

But the system of moderation is also inherently inconsistent. Because it relies largely on community policing—that is, on people reporting other people for real or perceived violations of community standards—some users are bound to be more heavily impacted than others. A person with a public profile and a lot of followers is mathematically more likely to be reported than a less popular user. And when a public figure is removed by one company, it can create a domino effect whereby other companies follow their lead.

Problematically, companies’ community standards also often feature exceptions for public figures: That’s why the president of the United States can tweet hateful things with impunity, but an ordinary user can’t. While there’s some sense to such policies—people should know what their politicians are saying—certain speech obviously carries more weight when spoken by someone in a position of authority.

Finally, when public pressure forces companies to react quickly to new “threats,” they tend to overreact. For example, after the passing of FOSTA—a law purportedly designed to stop sex trafficking but which, as a result of sweepingly broad language, has resulted in confusion and overbroad censorship by companies—Facebook implemented a policy on sexual solicitation that was essentially a honeypot for trolls. In responding to ongoing violence in Myanmar, the company created an internal manual that contained elements of misinformation. And it’s clear that some actors have greater ability to influence companies than others: A call from Congress or the European Parliament carries a lot more weight in Silicon Valley than one that originates from a country in Africa or Asia. By reacting to the media, governments, or other powerful actors, companies reinforce the power that such groups already have.

3. Content Moderation Decisions Can Cause Real-World Harms to Users as Well as Workers

Companies’ attempts to moderate what they deem undesirable content has all too often had a disproportionate effect on already-marginalized groups. Take, for example, the attempt by companies to eradicate homophobic and transphobic speech. While that sounds like a worthy goal, these policies have resulted in LGBTQ users being censored for engaging in counterspeech or for using reclaimed terms like “dyke”. 

Similarly, Facebook’s efforts to remove hate speech have impacted individuals who have tried to use the platform to call out racism by sharing the content of hateful messages they’ve received. As an article in the Washington Post explained, “Compounding their pain, Facebook will often go from censoring posts to locking users out of their accounts for 24 hours or more, without explanation — a punishment known among activists as ‘Facebook jail.’”

Content moderation can also pose harms to business. Small and large businesses alike increasingly rely on social media advertising, but strict content rules disproportionately impact certain types of businesses. Facebook bans ads that it deems “overly suggestive or sexually provocative”, a practice that has had a chilling effect on women’s health startups, bra companies, a book whose title contains the word “uterus”, and even the National Campaign to Prevent Teen and Unwanted Pregnancy.

4. Appeals Are Broken, and Transparency Is Minimal

For many years, users who wished to appeal a moderation decision had no feasible path for doing so...unless of course they had access to someone at a company. As a result, public figures and others with access to digital rights groups or the media were able to get their content reinstated, while others were left in the dark.

In recent years, some companies have made great strides in improving due process: Facebook, for example, expanded its appeals process last year. Still, users of various platforms complain that appeals lack result or go unanswered, and the introduction of more subtle enforcement mechanisms by some companies has meant that some moderation decisions are without a means of appeal.

Last year, we joined several organizations and academics in creating the Santa Clara Principles on Transparency and Accountability in Content Moderation, a set of minimum standards that companies should implement to ensure that their users have access to due process and receive notification when their content is restricted, and to provide transparency to the public about what expression is being restricted and how.

In the current system of content moderation, these are necessary measures that every company must take. But they are just a start.  

No More Magical Thinking

We shouldn’t look to Silicon Valley, or anyone else, to be international speech police for practical as much as political reasons. Content moderation is extremely difficult to get right, and at the scale at which some companies are operating, it may be impossible. As with any system of censorship, mistakes are inevitable.  As companies increasingly use artificial intelligence to flag or moderate content—another form of harm reduction, as it protects workers—we’re inevitably going to see more errors. And although the ability to appeal is an important measure of harm reduction, it’s not an adequate remedy.

Advocates, companies, policymakers, and users have a choice: try to prop up and reinforce a broken system—or remake it. If we choose the latter, which we should, here are some preliminary recommendations:

  • Censorship must be rare and well-justified, particularly by tech giants. At a minimum, that means (1) Before banning a category of speech, policymakers and companies must explain what makes that category so exceptional, and the rules to define its boundaries must be clear and predictable. Any restrictions on speech should be both necessary and proportionate. Emergency takedowns, such as those that followed the recent attack in New Zealand, must be well-defined and reserved for true emergencies. And (2) when content is flagged as violating community standards, absent exigent circumstances companies must notify the user and give them an opportunity to appeal before the content is taken down. If they choose to appeal, the content should stay up until the question is resolved. But (3) smaller platforms dedicated to serving specific communities may want to take a more aggressive approach. That’s fine, as long as Internet users have a range of meaningful options with which to engage.
  • Consistency. Companies should align their policies with human rights norms. In a paper published last year, David Kaye—the UN Special Rapporteur on the promotion and protection of the right to freedom of opinion and expression—recommends that companies adopt policies that allow users to “develop opinions, express themselves freely and access information of all kinds in a manner consistent with human rights law.” We agree, and we’re joined in that opinion by a growing coalition of civil liberties and human rights organizations.
  • Tools. Not everyone will be happy with every type of content, so users should be provided with more individualized tools to have control over what they see. For example, rather than banning consensual adult nudity outright, a platform could allow users to turn on or off the option to see it in their settings. Users could also have the option to share their settings with their community to apply to their own feeds.
  • Evidence-based policymaking. Policymakers should tread carefully when operating without facts, and not fall victim to political pressure. For example, while we know that disinformation spreads rapidly on social media, many of the policies created by companies in the wake of pressure appear to have had little effect. Companies should work with researchers and experts to respond more appropriately to issues.

Recognizing that something needs to be done is easy. Looking to AI to help do that thing is also easy. Actually doing content moderation well is very, very difficult, and you should be suspicious of any claim to the contrary.