NVIDIA GTC Taipei at COMPUTEX: Live Updates on What’s Next in AI
…Powered by the NVIDIA Blackwell GPU architecture, it delivers up to 2,070 FP4 teraflops of AI performance, with 7.5x the compute and 3.5x the energy efficiency of the previous NVIDIA…
NVIDIA doubled Blackwell performance through continuous software optimization, refining kernels, compiler paths, and inference runtimes so the same hardware delivers significantly more useful AI throughput over time. Initial gpt-oss-120b performance on an NVIDIA DGX Blackwell B200 system with the NVIDIA TensorRT LLM library was market-leading, but NVIDIA’s teams and the community have significantly optimized TensorRT LLM for open-source large language models. The TensorRT LLM v1.0 release is a major breakthrough in making large AI models faster and more responsive for everyone. Through advance
InferenceMAX v1, a new benchmark from SemiAnalysis released Monday, is the latest to highlight Blackwell’s inference leadership. It runs popular models across leading platforms, measures performance for a wide range of use cases and publishes results anyone can verify. Why do benchmarks like this matter? Because modern AI isn’t just about raw speed; it’s about efficiency and economics at scale. As models shift from one-shot replies to multistep reasoning and tool use, they generate far more tokens per query, dramatically increasing compute demands. NVIDIA’s open-source collaborations with Ope
Metrics like tokens per watt, cost per million tokens and TPS/user matter as much as throughput. In fact, for power-limited AI factories, Blackwell delivers 10x throughput per megawatt for mixture-of-experts models compared with the previous generation, which translates into higher token revenue. The cost per token is crucial for evaluating AI model efficiency, directly impacting operational expenses. The NVIDIA Blackwell architecture lowered cost per million tokens by 15x versus the previous generation, leading to substantial savings and fostering wider AI deployment and innovation.
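To make these metrics concrete, here is a minimal Python sketch of how cost per million tokens, throughput per megawatt and TPS/user could be computed from basic measurements. The function names and every input figure are hypothetical and chosen only for easy arithmetic; they are not NVIDIA or InferenceMAX numbers, and a real cost model would also account for factors such as utilization and networking.

```python
def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_second: float) -> float:
    """Dollars spent to generate one million tokens at a steady rate."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000


def tokens_per_second_per_megawatt(tokens_per_second: float, power_watts: float) -> float:
    """Throughput normalized by power draw, expressed per megawatt."""
    return tokens_per_second / (power_watts / 1_000_000)


def tps_per_user(tokens_per_second: float, concurrent_users: int) -> float:
    """Average tokens per second each user sees (a simple responsiveness proxy)."""
    return tokens_per_second / concurrent_users


if __name__ == "__main__":
    # Hypothetical system: $3.60 per hour, 1,000 tokens/s, 500 kW, 50 users.
    print(cost_per_million_tokens(3.60, 1_000))
    print(tokens_per_second_per_megawatt(1_000, 500_000))
    print(tps_per_user(1_000, 50))
```

Because cost per million tokens is inversely proportional to throughput at a fixed hourly cost, any gain in tokens per second at the same price lowers the cost per token by the same factor.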
InferenceMAX uses the Pareto frontier, a curve that shows the best trade-offs between different factors such as data center throughput and responsiveness, to map performance. But it’s more than a chart. It reflects how NVIDIA Blackwell balances the full spectrum of production priorities: cost, energy efficiency, throughput and responsiveness. That balance enables the highest ROI across real-world workloads. Systems that optimize for just one mode or scenario may show peak performance in isolation, but those economics don’t scale. Blackwell’s full-stack design delivers efficiency and
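The Pareto frontier idea described above can be sketched in a few lines of Python. This is an illustrative example, not the InferenceMAX implementation: the `Config` type, the `pareto_frontier` function and the sample data are all hypothetical. A configuration is kept only if no other configuration is at least as good on both throughput and responsiveness.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    name: str
    throughput: float     # tokens/s per GPU, higher is better
    interactivity: float  # tokens/s per user, higher is better


def pareto_frontier(configs: List[Config]) -> List[Config]:
    """Return the configurations not dominated on both metrics."""
    frontier = []
    best_interactivity = float("-inf")
    # Sweep from highest to lowest throughput. A point is on the frontier
    # only if it beats the interactivity of every point already seen,
    # all of which have equal or higher throughput.
    ordered = sorted(configs, key=lambda c: (-c.throughput, -c.interactivity))
    for config in ordered:
        if config.interactivity > best_interactivity:
            frontier.append(config)
            best_interactivity = config.interactivity
    return frontier


if __name__ == "__main__":
    # Hypothetical serving configurations, not measured results.
    sample = [
        Config("A", 1000, 20),
        Config("B", 800, 50),
        Config("C", 700, 40),   # dominated by B
        Config("D", 400, 120),
        Config("E", 300, 100),  # dominated by D
    ]
    print([c.name for c in pareto_frontier(sample)])
```

In this sample, C and E drop out because another configuration is better on both axes, while A, B and D each represent a different trade-off between throughput and responsiveness.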
…By simulating power, cooling and controls in Omniverse, Schneider enables operators to optimize performance per watt, validate designs before buildout and operate AI factories more efficiently and predictably at scale. Vertiv outlined…
…performance, scale and reliability in a single platform engineered through extreme codesign to enable AI model builders to launch frontier models faster, minimize training costs and start generating revenue early. Performance: Fastest…
…significantly improving performance voice quality compared with CPUs. Plug in to NVIDIA AI PC on Facebook, Instagram, TikTok and X, and stay informed by subscribing to the RTX AI PC newsletter. Follow…
…the app — from arrival notifications to AI-generated stickers — but Snap is also continuously rolling out behind-the-scenes updates such as performance optimizations and compatibility updates for new operating system versions…
…Luminous Robotics deploys AI-powered robotic systems for fast, low-cost solar-panel installation and maintenance. Roboto AI offers a data-analytics platform that accelerates robot development by managing and analyzing robotics…
…Powered by the NVIDIA Grace Blackwell architecture, with large unified memory and petaflop-level AI performance, these systems give developers new capabilities to develop locally and easily scale to the cloud. Advancing…
…of FP4 performance and 748GB of coherent memory, and is capable of running large AI models up to 1 trillion parameters, making it ideal for developing and running powerful AI agents locally…
…Learn more about how to calculate lowest cost per token and download the NVIDIA guide on Cost-Latency-Performance Optimization for AI Factories. Start building AI factories on NVIDIA’s full-stack…
…and minimizes risk for merchants. “Even fractional improvements like a 0.1% uplift in authorization can translate to massive incremental gross merchandise value and substantial cost reductions,” said Dhruv Ghulati, principal AI…