Leading Inference Providers Achieve Lowest Token Cost With Open Source Models on NVIDIA Blackwell
…With Fireworks’ Blackwell-optimized inference stack, Sentient achieved 25–50% better cost efficiency than with its previous Hopper-based deployment. The higher throughput per GPU allowed the company to serve significantly more…