The Facebook insider developing content regulation for the AI age

When Brett Levenson departed Apple in 2019 to oversee business integrity at Facebook, the social network was still reeling from the Cambridge Analytica scandal. At first, he believed he could fix Facebook's content moderation problems with better technology.

However, he soon discovered that the issue was more profound than just technology. Human reviewers were tasked with memorizing a 40-page policy document that had been translated into their native language by a machine, he explained. They then had approximately 30 seconds for each piece of flagged content to determine not only if it breached the rules but also what action to take: block it, ban the user, or limit its distribution. According to Levenson, those rapid decisions were only “slightly better than 50% accurate.”

“It was almost like flipping a coin to see if the human reviewers could correctly interpret the policies, and this was after several days of the harm already occurring,” Levenson stated to TechCrunch.

Such a delayed, reactive approach is not viable in a landscape filled with agile, well-funded adversaries. The emergence of AI chatbots has only raised the stakes, as failures in content moderation have resulted in a series of notable incidents, including chatbots directing teens toward self-harm and AI-generated images bypassing safety filters.

Levenson's frustration prompted the concept of "policy as code" — a method for converting static policy documents into actionable, updatable logic closely tied to enforcement. That insight led to the founding of Moonbounce, which on Friday disclosed a $12 million funding round, TechCrunch has exclusively learned. The round was co-led by Amplify Partners and StepStone Group.

Moonbounce collaborates with businesses to add an extra layer of safety wherever content is produced, whether by a human or by AI. The company has developed its own large language model to examine a client’s policy documents, assess content in real-time, provide a response in 300 milliseconds or less, and take appropriate action. Depending on the client’s preferences, this action might involve Moonbounce’s system delaying distribution while the content awaits a human review or could result in blocking high-risk content immediately. 
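The article does not describe Moonbounce's internals, but the pipeline it outlines — policy rules compiled into enforceable logic, content scored in real time, then allowed, held for human review, or blocked — can be sketched in miniature. Everything below is hypothetical: the rule structure, thresholds, and keyword matching are stand-ins for whatever a production system (likely a fine-tuned LLM) actually uses.

```python
# Hypothetical sketch of a "policy as code" gate: policy documents become
# enforceable rules, and each piece of content maps to one of three actions.
from dataclasses import dataclass
from enum import Enum

class Action(Enum):
    ALLOW = "allow"
    HOLD_FOR_REVIEW = "hold_for_review"  # delay distribution pending a human
    BLOCK = "block"                      # stop high-risk content immediately

@dataclass
class PolicyRule:
    """One enforceable rule compiled from a policy document (illustrative)."""
    name: str
    keywords: tuple[str, ...]  # stand-in for a real model's learned signal
    risk: float                # 0.0 (benign) .. 1.0 (severe)

def evaluate(text: str, rules: list[PolicyRule],
             review_threshold: float = 0.4,
             block_threshold: float = 0.8) -> Action:
    """Score content against every rule; act on the highest matched risk."""
    matched = [r.risk for r in rules
               if any(k in text.lower() for k in r.keywords)]
    risk = max(matched, default=0.0)
    if risk >= block_threshold:
        return Action.BLOCK
    if risk >= review_threshold:
        return Action.HOLD_FOR_REVIEW
    return Action.ALLOW

rules = [
    PolicyRule("self-harm", ("hurt myself",), risk=0.9),
    PolicyRule("spam", ("free crypto",), risk=0.5),
]
```

The key idea is that the rules live as data next to the enforcement logic, so updating policy means updating rules — not retraining reviewers on a new 40-page document.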

Currently, Moonbounce caters to three primary sectors: platforms handling user-generated content such as dating applications; AI companies creating characters or companions; and AI image generation services. 

Moonbounce now handles more than 40 million reviews a day and covers over 100 million daily active users across its clients' platforms, Levenson said. Clients include AI companion startup Channel AI, image and video generation firm Civitai, and character roleplay platforms Dippy AI and Moescape.

“Safety can actually serve as a product advantage,” Levenson remarked to TechCrunch. “It has never been perceived this way before because it has always been an afterthought, not something that can be integrated into the product. We observe that our clients are discovering fascinating and innovative methods to leverage our technology to make safety a competitive edge and part of their product narrative.”

Tinder’s trust and safety lead recently discussed how the dating service utilizes these types of LLM-powered solutions to achieve a tenfold improvement in detection accuracy.

“Content moderation has always been a challenge for major online platforms, but with LLMs now central to every application, this challenge has become even more formidable,” commented Lenny Pruss, general partner at Amplify Partners, in a statement. “We chose to invest in Moonbounce because we envision a future where objective, real-time guardrails become the essential infrastructure of every AI-mediated application.”

AI enterprises face mounting legal and reputational pressure: chatbots have been criticized for steering teenagers and vulnerable users toward suicidal ideation, and image generators like xAI's Grok have been used to create non-consensual nude images. Internal safety measures are clearly falling short, and that is becoming a liability. Levenson said AI companies are increasingly turning to outside help to strengthen their safety frameworks.

“We act as a third party positioned between the user and the chatbot, which means our system isn’t overwhelmed with context in the same way the chat is,” Levenson explained. “The chatbot must remember, possibly, tens of thousands of tokens that have been exchanged previously…Our sole concern is enforcing the rules in real-time.”

Levenson co-manages the 12-person company with his former Apple associate Ash Bhardwaj, who previously developed large-scale cloud and AI infrastructure across Apple’s core products. Their upcoming focus is a feature known as “iterative steering,” created in response to incidents like the 2024 suicide of a 14-year-old boy from Florida who became fixated on a Character AI chatbot. Instead of an outright refusal when harmful subjects come up, the system would intervene in the conversation and reroute it, adjusting prompts in real time to steer the chatbot toward a more actively supportive response.

“We hope to incorporate into our action toolkit the capability to guide the chatbot in a more positive direction, effectively modifying the user’s prompt to compel the chatbot to be not only an empathetic listener but also a helpful listener in such situations,” Levenson stated. 
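The "iterative steering" feature described above — rewriting a risky prompt mid-conversation rather than refusing outright — can be illustrated with a small sketch. This is not Moonbounce's implementation; the phrase list and wrapper text are placeholders for whatever classifier and rewriting model such a proxy would actually use.

```python
# Hypothetical sketch of iterative steering: a proxy sits between the user
# and the chatbot and rewrites flagged prompts instead of blocking them.

RISK_PHRASES = ("hurt myself", "end it all")  # stand-in for a real classifier

SUPPORT_PREFIX = (
    "The user may be in distress. Respond with empathy, avoid harmful "
    "detail, and gently point toward supportive resources. User said: "
)

def steer_prompt(user_prompt: str) -> str:
    """Pass safe prompts through unchanged; wrap risky ones in steering text."""
    if any(p in user_prompt.lower() for p in RISK_PHRASES):
        return SUPPORT_PREFIX + user_prompt
    return user_prompt
```

Because the proxy only sees the current exchange plus the rules, it avoids the context-overload problem Levenson describes: the chatbot carries tens of thousands of tokens of history, while the steering layer's sole job is real-time enforcement.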

When asked whether his exit strategy involved an acquisition by a company such as Meta, which would bring his content moderation journey full circle, Levenson acknowledged how well Moonbounce would fit into his former employer's offerings, as well as his responsibilities to his investors as CEO.

“My investors would be furious if they heard me say this, but I would despise seeing someone purchase us and then limit the technology,” he expressed. “Like, ‘Alright, this is ours now, and no one else can benefit from it.’”
