Advancing content provenance for a safer, more transparent AI ecosystem
OpenAI is strengthening content provenance through a multi-layered approach intended to build trust and transparency online. The approach makes provenance signals recognizable by other tools through C2PA conformance, and it integrates SynthID, Google's durable cross-platform watermark, into AI-generated images. A public verification tool, now in preview, helps users check whether an image originated from OpenAI tools.
OpenAI has been involved in developing and adopting provenance standards since 2024, adding Content Credentials to images from DALL·E 3, ImageGen, and Sora. OpenAI also joined the Steering Committee of the Coalition for Content Provenance and Authenticity (C2PA), an industry group that created an open technical standard for content provenance based on metadata and cryptographic signatures.
While C2PA metadata provides crucial context about content origin and modifications, it can be stripped or lost. To address this, OpenAI is adopting a multi-layered strategy by incorporating Google DeepMind’s SynthID, which embeds an invisible watermark. This watermark complements C2PA metadata, offering a more resilient provenance signal that can withstand transformations like screenshots.
These two systems, C2PA and SynthID, reinforce each other. C2PA provides detailed context, while SynthID helps preserve a signal when metadata is compromised. Together, they create a more robust provenance system than either method could achieve alone. OpenAI has also previously used visible watermarks in Sora and an audio watermark in Voice Engine.
A new public verification tool allows users to upload an image and check it for provenance signals, including Content Credentials and SynthID, to determine whether it was generated by ChatGPT, the OpenAI API, or Codex. The tool builds on earlier research into image detection classifiers and aims to help users reliably identify AI-generated content. However, no detection method is foolproof. Because signals can sometimes be stripped, the tool takes a cautious approach when none are found, so a missing signal should not be read as proof that an image did not come from OpenAI.
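The layered check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not OpenAI's implementation: the names `Signals`, `Verdict`, and `assess` are hypothetical, and the two boolean inputs stand in for the results of a real Content Credentials validator and a real watermark detector, neither of which is shown here. The sketch reports a match when either signal is present and stays inconclusive when both are absent.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    SIGNAL_FOUND = "at least one provenance signal was found"
    INCONCLUSIVE = "no signal found; origin cannot be determined"


@dataclass(frozen=True)
class Signals:
    """Hypothetical results of two independent checks on one image."""
    has_valid_content_credentials: bool  # assumed output of a C2PA metadata check
    has_watermark: bool                  # assumed output of a watermark detector


def assess(signals: Signals) -> tuple[Verdict, list[str]]:
    """Combine both layers; either signal alone is enough for a match."""
    found: list[str] = []
    if signals.has_valid_content_credentials:
        found.append("content_credentials")
    if signals.has_watermark:
        found.append("watermark")
    if found:
        return Verdict.SIGNAL_FOUND, found
    # Signals can be stripped, so absence is never reported as "not AI-generated".
    return Verdict.INCONCLUSIVE, found


if __name__ == "__main__":
    # A screenshot that lost its metadata but kept the watermark.
    print(assess(Signals(has_valid_content_credentials=False, has_watermark=True)))
    # An image with neither signal.
    print(assess(Signals(has_valid_content_credentials=False, has_watermark=False)))
```

The design choice worth noting is that the function has only two outcomes and no "not from OpenAI" verdict, which mirrors the cautious handling of absent signals described in the article.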
OpenAI believes that a strong provenance approach requires shared standards, durable watermarking signals, and public verification tools. By supporting Content Credentials, conforming to C2PA, adopting SynthID, and offering public verification, OpenAI aims to foster a more interoperable provenance ecosystem and empower individuals to understand the origin of content they encounter online.
