Claude Opus 4.6
… 3 This means Claude Opus 4.6 scores higher than GPT-5.2 on this eval approximately 70% of the time; a rate of 50% would indicate parity between the two models. For GPT-5.2 and Gemini 3 Pro, we compared against the best reported model version in the charts and table. …
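
The excerpt does not say how the 70% figure was derived. One common way to turn a score gap into a head-to-head win rate is the standard Elo expected-score formula, so the sketch below assumes an Elo-style rating with the conventional 400-point scale. That assumption, and the function names `win_probability` and `gap_for_win_probability`, are illustrative and not taken from the source.

```python
import math


def win_probability(rating_gap: float) -> float:
    """Expected win rate of the higher-rated model under the standard
    Elo formula. Assumes an Elo-style rating on a 400-point scale,
    which the excerpt does not confirm."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


def gap_for_win_probability(p: float) -> float:
    """Inverse of win_probability: the rating gap that yields win rate p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return 400.0 * math.log10(p / (1.0 - p))


if __name__ == "__main__":
    # A gap of zero is parity: each model wins half the time.
    print(win_probability(0.0))
    # Rating gap implied by a 70% win rate under this assumed model.
    print(gap_for_win_probability(0.70))
```

Under this assumed model, a gap of zero gives exactly 50% (parity), and a 70% win rate corresponds to a gap of roughly 147 rating points.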