How AI Is Transforming Work at Anthropic
…At the time this data was collected, Claude Sonnet 4 and Claude Opus 4 were the most capable models available, and capabilities have continued to advance. More capable AI brings productivity benefits…
Users will find Opus 4.8 to be a modest but tangible improvement on its predecessor. There’s still more to be done: we’re working on developing and releasing models that provide many of the same capabilities as Opus at a lower cost. Not only that, but we plan to release a new class of model with even higher intelligence than Opus. As part of Project Glasswing, a small number of organizations are currently using Claude Mythos Preview for cybersecurity work. Models of this capability level require stronger cyber safeguards before they can be generally released. We’re making swift progress on dev
Introducing Claude Opus 4.8…At the time this data was collected, Claude Sonnet 4 and Claude Opus 4 were the most capable models available, and capabilities have continued to advance. More capable AI brings productivity benefits…
…Claude already scored well on craft and functionality by default, as the required technical competence tended to come naturally to the model. But on design and originality, Claude often produced outputs that…
…However, even the strongest models did not consistently achieve the level of accuracy and reproducibility required for reliable dataset construction. Claude Sonnet 4, Claude Opus 4.7, Biomni OSS, Edison Analysis, GPT…
…update An early update on what we've learned from Project Glasswing. 2028: Two scenarios for global AI leadership Our views on the AI competition between the US and China. Teaching Claude…
…Turning Claude’s thoughts into text AI models like Claude talk in words but think in numbers. In this study we train Claude to translate its thoughts into human-readable text. Donating…
Claude Mythos Preview is a new general-purpose language model that is strikingly capable at computer security tasks. This post provides technical details for researchers and practitioners who want to understand exactly…
…With frontier models, this bottleneck has largely fallen away. Across 18 recent Firefox security patches, Claude Mythos Preview, our most capable model, built 8 working code-execution exploits autonomously. And on 21…
…Routing easy/common questions to smaller, cost-efficient models like Claude Haiku 4.5 and hard/unusual questions to more capable models like Claude Sonnet 4.5 to optimize for best performance…
In a recent evaluation of AI models’ cyber capabilities, current Claude models can now succeed at multistage attacks on networks with dozens of hosts using only standard, open-source tools, instead of…
…Turning Claude’s thoughts into text AI models like Claude talk in words but think in numbers. In this study we train Claude to translate its thoughts into human-readable text. Donating…