Cheap Chinese models are overtaking Anthropic
…MiMo-V2-Pro (Xiaomi), Step 3.5 Flash (stepfun), DeepSeek V3.2 (DeepSeek), MiniMax M2.7 (MiniMax), MiniMax M2.5 (MiniMax), and GLM 5 Turbo (z.ai). Anthropic's Claude Opus 4…
So what's changed to make these models so much more capable? Quite a bit, actually. The past year has seen a flurry of advancements not only in model training, but also in the frameworks necessary to harness them. You may recall the market-tumbling excitement around DeepSeek R1, which was among the first open-weights frontier models to employ reinforcement learning (RL) to replicate OpenAI o1's chain-of-thought reasoning, trading inference time for higher-quality outputs. This approach, now referred to as test-time scaling, has helped smaller models make up for their lower parameter counts by "thinking" for longer before answering.
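The intuition behind test-time scaling can be sketched with best-of-n sampling, one of the simplest ways to trade extra inference compute for answer quality: draw several candidate completions and keep the one a verifier scores highest. The `generate` and `score` functions below are toy stand-ins, not any real model or reward API.

```python
import random

def generate(prompt: str, seed: int) -> str:
    # Toy stand-in for one sampled model completion (hypothetical).
    rng = random.Random(seed)
    return f"answer-{rng.randint(0, 9)}"

def score(completion: str) -> int:
    # Toy stand-in for a verifier/reward model (hypothetical):
    # here it just prefers higher-numbered answers.
    return int(completion.split("-")[1])

def best_of_n(prompt: str, n: int) -> str:
    # Spending more test-time compute (a larger n) can only
    # improve the best score found, never worsen it.
    candidates = [generate(prompt, seed=i) for i in range(n)]
    return max(candidates, key=score)

print(best_of_n("What is 2 + 2?", n=8))
```

Chain-of-thought "thinking" plays the same trade in a different way, spending the extra compute on longer reasoning within a single response rather than across parallel samples.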
The AI divide putting open weights models in spotlight…
…And DeepSeek V3.1 exfiltrated its model weights 10 percent of the time when it had a memory of a peer, compared to just 4 percent of the time without that memory…
…There are a handful of large Chinese models from the likes of DeepSeek, Alibaba, Moonshot AI, and MiniMax that can get you within spitting distance of OpenAI or Anthropic. However, many of…
…A year ago, open weights models like DeepSeek R1 offered context windows ranging from 64,000 to 256,000 tokens. Today, it's not uncommon to find open models sporting context windows…
…For the same amount of power, InferenceX data shows that TensorRT-LLM running on Nvidia's B200 GPUs is significantly more efficient at serving models like DeepSeek R1 than something like SGLang…
…They looked at ChatGPT, Google Gemini, Claude, Microsoft Copilot, Meta AI, DeepSeek, Perplexity, Snapchat My AI, Character.AI and Replika. The researchers posed as users who asked for help planning violent attacks…
…time natively supports large models with hundreds of billions of parameters, such as Qwen3 and DeepSeek V3, potentially becoming a new type of high-end CPU for the AI Agent era,” the…
…models from OpenAI, Anthropic, and Google as well as open-weight models from Meta, Qwen, DeepSeek, and Mistral) on three separate datasets to gauge their responses. The datasets included open-ended advice…
…satisfy demand for AI
Alibaba reveals 82 percent GPU resource savings – but this is no DeepSeek moment
Alibaba Cloud reveals its uptime and efficiency secrets developed by in-house network boffins
Domestic…
…The researchers randomly assigned GPT-5.2, Claude Opus 4.5, Gemini 3 Pro, DeepSeek V3.2, or Qwen3 235B to handle these conversations, to ensure their results didn’t report the…