Anthropic, a leading player in the AI industry, has announced a new program to fund the development of innovative benchmarks for evaluating the performance and impact of AI models, including generative models like its own Claude.
The initiative, unveiled on Monday, will provide financial support to third-party organizations that can effectively measure advanced capabilities in AI models.
Addressing the AI Benchmarking Problem
Challenges with Current Benchmarks
As Anthropic highlights, the AI industry faces significant challenges with current benchmarking standards.
Many of the most commonly cited benchmarks fail to capture how the average person uses AI systems and may not accurately measure the capabilities of modern generative AI. This gap necessitates the development of more robust and relevant benchmarks.
Focus on Safety and Societal Impact
Anthropic’s program seeks to create benchmarks that focus on AI security and societal implications. The company is calling for tests that assess AI models’ ability to perform tasks such as executing cyberattacks, enhancing weapons of mass destruction, and manipulating or deceiving people through deepfakes or misinformation.
For risks related to national security and defense, Anthropic aims to develop an early warning system to identify and assess these threats.
Encouraging Comprehensive AI Evaluations
Supporting Research and Development
The new program also supports research into benchmarks that probe AI’s potential for aiding scientific study, conversing in multiple languages, mitigating biases, and self-censoring toxicity.
Anthropic envisions platforms where subject-matter experts can develop their own evaluations and conduct large-scale trials involving thousands of users.
Program Management and Funding Options
To facilitate this ambitious initiative, Anthropic has hired a full-time program coordinator and may purchase or expand promising projects.
The company offers a range of funding options tailored to the needs and stages of each project, with opportunities for teams to interact directly with Anthropic’s domain experts.
Industry Implications and Skepticism
Commercial Ambitions vs. Trust
While Anthropic’s effort to support new AI benchmarks is commendable, it comes with a caveat. The company’s commercial ambitions in the competitive AI market may influence the objectivity of the benchmarks it funds.
Anthropic is transparent about wanting certain evaluations to align with its AI safety classifications, which might compel applicants to adhere to definitions of “safe” or “risky” AI that they do not fully agree with.
Broader AI Community Concerns
Debate on AI Risks
A segment of the AI community is likely to question Anthropic’s focus on “catastrophic” and “deceptive” AI risks, such as those involving nuclear weapons. Many experts argue that there is little evidence suggesting AI will develop world-ending capabilities in the foreseeable future.
These experts contend that emphasizing superintelligence diverts attention from more immediate regulatory issues, like AI’s hallucinatory tendencies.
Catalyst for Industry Standards
In its blog post, Anthropic expresses hope that its program will catalyze the establishment of comprehensive AI evaluation as an industry standard.
While this mission aligns with many open, corporate-unaffiliated efforts to improve AI benchmarks, it remains to be seen whether these independent initiatives will collaborate with an AI vendor primarily accountable to its shareholders.
Future of AI Benchmarking
Anthropic’s initiative represents a significant step toward addressing the shortcomings of current AI benchmarks and ensuring that AI models are evaluated rigorously and comprehensively.
The success of this program will depend on its ability to balance commercial interests with the broader goal of advancing AI safety and effectiveness. As the AI landscape continues to evolve, the development of robust benchmarks will be crucial in guiding the responsible deployment of AI technologies.