Search: timing risk

Teaching Claude why

…Specifically, at the time of Claude 4’s training, the vast majority of our alignment training was standard chat-based Reinforcement Learning from Human Feedback (RLHF) data that did not include any…

May 8, 2026

Project Fetch: Can Claude train a robot dog?

…Team Claude accomplished more tasks and completed them faster on average—indeed, Team Claude succeeded in about half the time it took Team Claude-less. Only Team Claude made substantial progress toward…

Nov 12, 2025

Project Vend: Can Claude run a small shop? (And why does that matter?)

…Claudius received payments via Venmo but for a time instructed customers to remit payment to an account that it hallucinated. Selling at a loss: In its zeal for responding to customers’ metal…

Jun 27, 2025

Measuring LLMs’ ability to develop exploits

…time-of-release, the performance of models prior to Opus 4.5 follows a log-linear trajectory, with a mean doubling time of 1.1 months. Our models since Opus 4.5…

May 22, 2026

Quantifying infrastructure noise in agentic coding evals

…Two agents with different resource budgets and time limits aren't taking the same test. Eval developers have begun accounting for this. Terminal-Bench 2.0, for instance, specifies recommended CPU and…

Feb 5, 2026

Measuring LLMs' impact on N-day exploits

…This means that a working exploit is often simply a matter of time. Historically, patch diffing has been slow, specialized work, which bought defenders time to roll out their updates widely. The…

Jun 8, 2026

Anthropic Economic Index report: Cadences

…Why does this matter economically? In conversations mapped to higher-wage occupations, Claude produces more (1.34 times as much output per turn), while users engage more (1.53 times as many…

Jun 26, 2026

Introducing Sonnet 4.6

…At the same time, computer use poses risks: malicious actors can attempt to hijack the model by hiding instructions on websites in what’s known as a prompt injection attack. We’ve…

Feb 17, 2026

Emotion concepts and their function in a large language model

…they encode the operative emotional content most relevant to the model’s current or upcoming output, rather than persistently tracking Claude’s emotional state over time. For instance, if Claude writes a…

Apr 2, 2026

The assistant axis: situating and stabilizing the character of large language models

…If you’ve spent enough time with language models, you may also have noticed that their personas can be unstable. Models that are typically helpful and professional can sometimes go “off the…

Jan 19, 2026
