Search: AI safety pause

Trustworthy agents in practice

… We tackle this from multiple angles during Claude’s training. First, we construct training scenarios that place Claude in ambiguous situations, and then reinforce Claude’s choice to pause, rather than to assume. …

Apr 9, 2026

Measuring AI agent autonomy in practice

… Model developers should consider training models to recognize their own uncertainty. Training models to recognize their own uncertainty and surface issues to humans proactively is an important safety property that complements external safeguards like human approval flows and access restrictions. …

Feb 18, 2026

Widening the conversation on frontier AI

… This raises questions about how the character of an AI system should be shaped: What does it mean for an AI to be good? Which traits and behaviors should it display, and under what circumstances? …

May 19, 2026

Building Effective AI Agents

… Agents can then pause for human feedback at checkpoints or when encountering blockers. The task often terminates upon completion, but it’s also common to include stopping conditions such as a maximum number of iterations to maintain control. …

Dec 19, 2024
