On March 19, Meta published a blog post announcing a "wide rollout" of its AI support assistant across Facebook and Instagram. Buried in the third paragraph was the sentence that matters: the company will "reduce our reliance on third-party vendors" for content enforcement.
That is corporate language for laying off thousands of contract workers whose job, for the last decade, has been to watch the worst things humans post online so the rest of us don't have to.
Meta says the transition will take "a few years." It did not say how many contractors would lose their positions, or when. It did not name the vendors affected, though CNBC reported the company has historically relied on firms like Accenture, Concentrix, and Teleperformance. What Meta did provide was a set of performance claims for its new AI systems and a promise that humans would still handle "the most complex, high-impact decisions" involving law enforcement and account appeals.
What the AI can actually do
Meta's announcement included specific numbers, which is unusual for a company that typically keeps its moderation metrics close to its chest.
The company says its AI systems now catch 5,000 scam attempts per day that no existing human review team had identified. It claims an 80% reduction in user reports of celebrity impersonation accounts. For adult sexual solicitation content, Meta says AI catches twice as much as human teams while making 60% fewer "overenforcement mistakes," meaning fewer legitimate posts get wrongly removed. The new systems also cover languages spoken by 98% of people online; the previous setup supported roughly 80 languages.
These are real improvements if the numbers hold. Scam detection and fake account removal are high-volume, pattern-matching problems where machine learning has clear advantages over human reviewers scrolling through queues.
But content moderation is not just scam detection.
Where AI moderation falls apart
The hard cases in content moderation have never been the obvious ones. They are the satirical post that looks like a threat. The documentary photograph that an algorithm flags as graphic violence. The political speech that reads differently in Burmese than in English. The breastfeeding photo, the war reporting image, the drag performance that a classifier trained on American norms labels as sexual content.
A 2025 paper in Artificial Intelligence Review argued that accuracy metrics for LLM-based moderation are "insufficient and misleading" because they fail to distinguish between easy cases and hard cases. Getting 99% accuracy on obvious spam is a different achievement than getting 99% accuracy on political satire in Tigrinya.
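The distinction is easy to show with arithmetic. Here is a minimal sketch in Python, using hypothetical numbers rather than anything from Meta or the paper: when easy cases dominate the evaluation set, a classifier that badly mishandles the hard cases can still report an overall accuracy just shy of 99%.

def accuracy(correct, total):
    return correct / total

# Hypothetical volumes: obvious spam vastly outnumbers hard cases
# such as satire, war reporting, or non-English political speech.
easy_total, easy_correct = 990_000, 980_100   # 99.0% accuracy on easy cases
hard_total, hard_correct = 10_000, 6_500      # 65.0% accuracy on hard cases

overall = accuracy(easy_correct + hard_correct, easy_total + hard_total)
print(f"easy:    {accuracy(easy_correct, easy_total):.1%}")   # 99.0%
print(f"hard:    {accuracy(hard_correct, hard_total):.1%}")   # 65.0%
print(f"overall: {overall:.1%}")                              # 98.7%

A headline accuracy figure, in other words, mostly measures how well the system handles the cases that were never in dispute.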



