Anthropic Mythos model can find and exploit 0-days
…Mythos is markedly different from Claude Opus 4.6, which Anthropic only recently said was not very skilled at developing working exploit code. Where Opus 4.6 managed an exploit development success…
…67.7%; Claude Opus 4.6: 66.6%; GPT-5.2 Codex: 62.5%; Claude Opus 4.5: 61.9%; Gemini 3 Pro Preview: 60.4%; Claude Sonnet 4.6: 58.4…
…I used Opus 4.6 here because, at the time of testing, Opus 4.7 hadn’t been released yet. What I found surprised me in both directions. One of them is…
…JRickey used Opus 4.6, Opus 4.7, and GPT 5.5 as the sole contributors in its development. This was mainly done as a proof of concept to show that AI…
Claude Code Degraded Before Opus 4.8 Release
Is anyone else seeing a massive performance drop in Opus 4.8 since release?? It used to be acceptable, but the enshittification has definitely happened. It’s basically been lobotomized, and we’re talking amateur backyard …
As an Anthropic fanboy (check my prev. comments), this is the first Opus release where I feel like the model is just not pleasant to talk to, not to mention untrustworthy. The two examples for me where I lost confidence in…
Anthropic tests every model before it releases them. Opus 4.8 might be worse on certain benchmarks compared to 4.7, and there probably are some genuine gripes with the model, but the amount of “Opus 4.8 is all around wors…
It's been a month of tagging sama, openai, and tibo on X about this issue, and no one seems to reply, and everyone is flattering Codex. I'm sure I'm not the only one facing this. I switched to Codex from Claude since it was better consume le…
…to improve security. Opus 4.6 is currently far better at identifying and fixing vulnerabilities than at exploiting them. This gives defenders the advantage. And with the recent release of Claude Code…
…To help combat this problem, the government-sponsored Estonian Language Institute (ELI) has released a new “Propaganda Resistance” benchmark ranking dozens of LLMs on their ability to avoid “tak[ing] positions on…
…They often even prefer it to our smartest model from November 2025, Claude Opus 4.5. Performance that would have previously required reaching for an Opus-class model—including on real-world…
…Most Popular: Blue Origin’s New Glenn rocket explodes during testing in Florida; Anthropic releases Opus 4.8 with new ‘dynamic workflow’ tool; Meta launches Instagram, Facebook, and WhatsApp subscriptions…
…To support this, we recently released Claude Security, a product that uses our latest public frontier models, like Claude Opus 4.8, to scan codebases and suggest patches. We're also releasing…
…As we announced in our changelog, Opus 4.5 and Opus 4.6 will be removed from Pro+. These changes are necessary to ensure we can serve existing customers with a predictable…