Helping build shared standards for advanced AI
OpenAI is actively contributing to the development of shared standards for advanced AI, particularly through its involvement with the Appia Foundation. This initiative aims to create a critical "trust layer" for AI, enabling standardized assessment criteria across the AI value chain and fostering international cooperation. By establishing interoperable practices and technical understanding, the effort seeks to ensure safe and confident advancement of AI technologies globally.
Advanced AI models hold immense potential for societal benefit, from strengthening cyber defense to accelerating scientific discovery. However, their increasing capabilities also introduce significant safety and security concerns, especially if safeguards are inadequate or understanding of their operation is limited. To harness the benefits of AI safely, robust institutions with the technical and governance capacity to evaluate and secure these systems are essential. These institutions must be capable of responding effectively to the evolving challenges posed by advanced AI.
OpenAI has taken a proactive step in this direction by helping establish the Appia Foundation, hosted by the Linux Foundation. Appia's mission is to develop open, modular specifications that translate international standards into practical assessment criteria for the AI value chain. This initiative aims to build a crucial "trust layer" that allows third parties to verify conformity with established standards, ensuring clearer and more reusable evidence as AI components are developed by different organizations. This shared technical language is vital for fostering trust and cooperation among national and international institutions.
The development of shared standards is a key component of a broader effort to strengthen the institutional frameworks and assessment practices necessary for advanced AI systems. This includes creating a durable U.S. framework, reinforcing the Center for AI Standards and Innovation (CAISI), and implementing a comprehensive government-wide resilience strategy. Recognizing the global nature of AI risks, international cooperation is paramount. Nations must collaborate to develop compatible safety frameworks, establish trusted channels for sharing risk information, and coordinate incident response efforts.
Standards must be built on credible evaluation practices and technical rigor. OpenAI's shared playbook for trustworthy third-party evaluations emphasizes transparency in assessing frontier systems: it calls for disclosing the system tested, its tool access, the evaluation harness, and the methods used to elicit capabilities. These principles have been applied in testing partnerships with entities like US CAISI and UK AISI, leading to tangible improvements in existing AI systems and establishing a foundation for standardized performance checks.
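As a rough illustration of the four transparency aspects named above, a third-party evaluation report could carry a small structured record and be checked for completeness before it is shared. The sketch below is purely hypothetical: the class name, field names, and example values are assumptions made for illustration and do not come from OpenAI's playbook, from Appia, or from any published specification.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass


@dataclass
class EvaluationDisclosure:
    """Hypothetical record of what an evaluator discloses about a test."""

    system_tested: str              # which model or system version was evaluated
    tool_access: list[str]          # tools the system could call during testing
    evaluation_harness: str         # software used to run and score the tests
    elicitation_methods: list[str]  # techniques used to draw out capabilities

    def missing_fields(self) -> list[str]:
        """Return the names of fields left empty, in declaration order."""
        return [name for name, value in asdict(self).items() if not value]


# Illustrative values only; none of these describe a real evaluation.
example = EvaluationDisclosure(
    system_tested="example-model-v1",
    tool_access=[],
    evaluation_harness="example-harness",
    elicitation_methods=["few-shot prompting"],
)

print(example.missing_fields())
```

A flat record like this is only one possible shape. Note that an empty list is treated as missing here, so a real schema would need a way to distinguish "no tool access" from "tool access not reported".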
Appia's work extends these efforts by focusing on making these essential practices interoperable across various organizations, jurisdictions, and the entire AI supply chain. OpenAI's involvement in numerous standards bodies and pre-standardization initiatives, including ISO/IEC JTC 1/SC 42 and the Frontier Model Forum, underscores its commitment to translating cutting-edge development insights into open, technically sound practices. The ultimate goal is to provide governments, companies, and independent assessors with universally applicable tools for safe and responsible AI deployment.
