Musk's Colossus 1 AI supercomputer's inefficient mixed-architecture design couldn't be used to train Grok, so Anthropic's using it for inference instead — Musk readies unified Blackwell-only Colossus 2 for frontier training and potential IPO
… When the faster GB200 chips finish their work first, the entire cluster waits for the slower H100s to catch up, a well-known bottleneck called the straggler effect. At 220,000 chips, the cost compounds: every synchronized step runs at the pace of the slowest chip, so the idle time is multiplied across the whole cluster. …
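To make the straggler effect concrete, here is a minimal Python sketch of one synchronized training step, in which the step ends only when the slowest worker finishes. The per-step timings (`FAST`, `SLOW`), the worker counts, and the helper names are hypothetical values chosen for illustration. They are not measurements of GB200 or H100 hardware, and the model ignores communication cost entirely.

```python
import random

def sync_step_time(per_worker_times):
    """A synchronous step finishes only when the slowest worker does."""
    return max(per_worker_times)

def idle_fraction(per_worker_times):
    """Fraction of total worker-time spent waiting at the barrier."""
    step = sync_step_time(per_worker_times)
    busy = sum(per_worker_times)
    return 1.0 - busy / (step * len(per_worker_times))

# Hypothetical per-step compute times in arbitrary units (not benchmarks).
FAST, SLOW = 1.0, 2.0

mixed = [FAST] * 6 + [SLOW] * 2    # six fast workers, two slow ones
uniform = [FAST] * 8               # single-architecture cluster

print(sync_step_time(mixed))       # 2.0: the slow workers set the pace
print(idle_fraction(mixed))        # 0.375: share of worker-time spent idle
print(idle_fraction(uniform))      # 0.0: nobody waits

def mean_step_time(n_workers, trials=200, jitter=0.1, seed=0):
    """Average step time when each worker has small random slowdowns.

    Assumes each worker takes 1.0 plus a uniform delay in [0, jitter].
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        times = [1.0 + rng.uniform(0.0, jitter) for _ in range(n_workers)]
        total += sync_step_time(times)
    return total / trials

# Even identical hardware drifts toward its worst case as the pool grows.
print(mean_step_time(8), mean_step_time(512))
```

In this toy model the mixed pool loses 37.5% of its worker-time to waiting, and the second experiment shows why scale matters: the more workers share a barrier, the more likely it is that at least one of them is running near its worst case on any given step.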