Tuesday, October 7, 2025

Accelerating AI Safety Research: An Open-Source Auditing Tool from Anthropic

Unlock AI Safety with Petri: Your Essential Tool for Risk Evaluation

Introducing Petri: an open-source tool designed for researchers to efficiently explore AI model behaviors. It alleviates the burden of manual auditing, allowing you to focus on the most critical insights.

Key Features:

  • Automated Agent Testing: Deploys simulated users and tools to evaluate AI behavior in multi-turn conversations.
  • Broad-Coverage Evaluations: Test 14 frontier models with 111 diverse seed instructions covering risky behaviors like deception and self-preservation.
  • Efficient Scoring: Utilizes judges to score conversations based on safety-relevant dimensions, surfacing concerning outcomes for review.

Why Petri Matters:

  • Combatting AI Risk: As AI systems grow more complex, traditional manual audits fall short. Petri offers a scalable solution to identify misaligned behaviors quickly.
  • Community-Driven Development: Join early adopters like UK AISI to refine metrics and enhance safety evaluations in AI.

Explore how Petri can elevate your AI research. Visit our GitHub page and start testing today! Let’s advance AI safety together—share your thoughts below!

Source link

Share

Table of contents [hide]

Read more

Local News