Showing top 1 result for "comparisons with OpenAI"

Building The Imperfect Beast

… The jump in performance of Mythos over Opus 4.6 on Humanity’s Last Exam perhaps not ironic, and not multiple choice but including problems models cannot solve and domain experts – meaning people – struggle with but can solve and on the CharXiv reasoning benchmark checks how models reason from chart…

Apr 13, 2026 · Timothy Prickett Morgan