Meta says you better disclose your AI fakes or it might just pull them

Meta logo on blue background. Illustration by Alex Castro / The Verge

Meta will start labeling AI-generated photos uploaded to Facebook, Instagram, and Threads over the coming months as election season ramps up around the world. The company will also begin punishing users who don’t disclose if a realistic video or piece of audio was made with AI.

Nick Clegg, Meta’s president of global affairs, said in an interview that these steps are meant to “galvanize” the tech industry as AI-generated media becomes increasingly difficult to discern from reality. The White House has pushed hard for companies to watermark AI-generated content. In the meantime, Meta is building tools to detect synthetic media even if its metadata has been altered to obfuscate AI’s role in its creation, according to Clegg.

Meta already applies an “Imagined with AI” watermark to images created with its own Imagine AI generator, and the company will begin doing the same to AI-generated photos made with tools from Google, OpenAI, Microsoft, Adobe, Midjourney, and Shutterstock. Clegg said the industry is further behind on building standards to identify AI-generated video and audio, and that, while Meta is on high alert for how such media can be used to deceive, the company isn’t going to be able to catch everything on its own.

“For those who are worried about video, audio content being designed to materially deceive the public on a matter of political importance in the run-up to the election, we’re going to be pretty vigilant,” he said. “Do I think that there is a possibility that something may happen where, however quickly it’s detected or quickly labeled, nonetheless we’re somehow accused of having dropped the ball? Yeah, I think that is possible, if not likely.”

An example of a watermarked AI image made with Meta’s free tool, showing Barack Obama from the prompt “a president of the United States.” (The Verge)

Meta has been working with groups like Partnership on AI to build on existing content authenticity initiatives. Adobe recently released the Content Credentials system that puts content provenance information into the metadata of images. And Google extended its SynthID watermark to audio files after releasing it in beta for images.

Clegg said Meta will soon begin requiring that its users disclose when realistic video or audio posts are made with AI. If they don’t, “the range of penalties that will apply will run the full gamut from warnings through to removal” of the offending post, he said.

There are already plenty of examples of viral, AI-generated posts of politicians, but Clegg downplayed the chances of the phenomenon overrunning Meta’s platforms in an election year. “I think it’s really unlikely that you’re going to get a video or audio which is entirely synthetic of very significant political importance which we don’t get to see pretty quickly,” he said. “I just don’t think that’s the way that it’s going to play out.”

Meta is also starting to internally test the use of large language models (LLMs) trained on its Community Standards, he said, calling it an efficient “triage mechanism” for its tens of thousands of human moderators. “It appears to be a highly effective and rather precise way of ensuring that what is escalated to our human reviewers really is the kind of edge cases for which you want human judgment.”