Cheap Chinese models are overtaking Anthropic
…MiMo-V2-Pro (Xiaomi), Step 3.5 Flash (stepfun), DeepSeek V3.2 (DeepSeek), MiniMax M2.7 (MiniMax), MiniMax M2.5 (MiniMax), and GLM 5 Turbo (z.ai). Anthropic's Claude Opus 4…
So what's changed to make these models so much more capable? Quite a bit, actually. The past year has seen a flurry of advancements not only in model training, but also in the frameworks necessary to harness them. You may recall the market-tumbling excitement around DeepSeek R1, which was among the first open-weights frontier models to employ reinforcement learning (RL) to replicate OpenAI o1's chain-of-thought reasoning, trading inference time for higher-quality outputs. This approach, now referred to as test-time scaling, has helped smaller models make up for their lower parameter counts by "thinking" for longer before answering.
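The intuition behind test-time scaling can be sketched with best-of-n sampling, one of the simplest ways to trade extra inference compute for answer quality: draw several candidate completions and keep the one a verifier scores highest. The `generate` and `score` functions below are toy stand-ins, not any real model or reward API.

```python
import random

def generate(prompt: str, seed: int) -> str:
    # Toy stand-in for one sampled model completion (hypothetical).
    rng = random.Random(seed)
    return f"answer-{rng.randint(0, 9)}"

def score(completion: str) -> int:
    # Toy stand-in for a verifier/reward model (hypothetical):
    # here it just prefers higher-numbered answers.
    return int(completion.split("-")[1])

def best_of_n(prompt: str, n: int) -> str:
    # Spending more test-time compute (a larger n) can only
    # improve the best score found, never worsen it.
    candidates = [generate(prompt, seed=i) for i in range(n)]
    return max(candidates, key=score)

print(best_of_n("What is 2 + 2?", n=8))
```

Chain-of-thought "thinking" plays the same trade in a different way, spending the extra compute on longer reasoning within a single response rather than across parallel samples.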
The AI divide putting open weights models in spotlight…
…And DeepSeek V3.1 exfiltrated its model weights 10 percent of the time when it had a memory of a peer, compared to just 4 percent of the time without that memory…
…There are a handful of large Chinese models from the likes of DeepSeek, Alibaba, Moonshot AI, and MiniMax that can get you within spitting distance of OpenAI or Anthropic. However, many of…
…A year ago, open weights models like DeepSeek R1 offered context windows ranging from 64,000 to 256,000 tokens. Today, it's not uncommon to find open models sporting context windows…
…For the same amount of power, InferenceX data shows that TensorRT-LLM running on Nvidia's B200 GPUs is significantly more efficient at serving models like DeepSeek R1 than something like SGLang…
…They looked at ChatGPT, Google Gemini, Claude, Microsoft Copilot, Meta AI, DeepSeek, Perplexity, Snapchat My AI, Character.AI and Replika. The researchers posed as users who asked for help planning violent attacks…
…time natively supports large models with hundreds of billions of parameters, such as Qwen3 and DeepSeek V3, potentially becoming a new type of high-end CPU for the AI Agent era,” the…
…models from OpenAI, Anthropic, and Google as well as open-weight models from Meta, Qwen, DeepSeek, and Mistral) on three separate datasets to gauge their responses. The datasets included open-ended advice…
…satisfy demand for AI
Alibaba reveals 82 percent GPU resource savings – but this is no DeepSeek moment
Alibaba Cloud reveals its uptime and efficiency secrets developed by in-house network boffins
Domestic…
…The researchers randomly assigned GPT-5.2, Claude Opus 4.5, Gemini 3 Pro, DeepSeek V3.2, or Qwen3 235B to handle these conversations, to ensure their results didn’t report the…