Search

Showing top 113 results for "AI cost and tokens"

All sources blogs.nvidia.com 19 wccftech.com 16 developer.nvidia.com 10 techcrunch.com 9 tomshardware.com 8 theregister.com 8 huggingface.co 6 amd.com 5 theverge.com 2 androidauthority.com 2 engadget.com 2 pcworld.com 2

People also ask

What Are the Factors That Lower Token Cost?

Understanding how to optimize token cost requires looking at the equation for calculating cost per million tokens. In this equation, many enterprises evaluating AI infrastructure focus on the numerator: the cost per GPU per hour. For cloud deployments, this is the hourly rate paid to a cloud provider; for on-premises deployments, it’s the effective hourly cost derived from amortizing owned infrastructure. The real key to reducing token cost, however, lies in the denominator: maximizing the delivered token output. That denominator carries two business implications. Minimize token cost: When thi

Rethinking AI TCO: Why Cost per Token Is the Only Metric That Matters

Why Does Cost per Token Matter Much More Than FLOPS per Dollar?

The following data for the DeepSeek-R1 AI model demonstrates the difference between theoretical and actual business outcomes. Looking at compute cost alone, the NVIDIA Blackwell platform appears to cost roughly 2x more than NVIDIA Hopper — but compute cost says nothing about the output that investment buys. An analysis of mere FLOPS per dollar suggests a 2x NVIDIA Blackwell advantage compared with the NVIDIA Hopper architecture. However, the actual outcome is orders of magnitude different: Blackwell delivers more than 50x greater token output per watt than Hopper, resulting in nearly 35x lower

Rethinking AI TCO: Why Cost per Token Is the Only Metric That Matters

Videos

You’re about to feel the AI money squeeze

…AI companies have broken ground on data centers around the world, dedicating billions of dollars with promises of better models, lower costs, and AI for everyone. Even stemming the flow of losses…

Apr 23, 2026 · Hayden Field

AMD Instinct MI355X GPU Sets a New Bar for DeepSeek Inference

…Cost per Token and Fleet Economics Cost per token is where infrastructure decisions become business decisions. The right evaluation uses a transparent ownership model at the interactivity target each application requires. Starting…

Jun 10, 2026 · Ramine Roane

The Many Aspects of Inference Performance

…GB300 (SGLang and TRT-LLM). Source: SemiAnalysis InferenceX™, Mar 7, 2026. To illustrate the impact of software optimization on cost per token : since February, MI355X GPU cost per token has dropped significantly…

May 11, 2026 · AMD AI Group

Is this the dawn of the Tokenpocalypse? | TechCrunch

…Can these AI labs collapse that cost [and] progress the tech enough in a way that it eventually meets in the middle with customers’ appetite for spending? A funny thing to think…

Jun 7, 2026 · Anthony Ha

What Drives AI Inference Profitability?

…NVIDIA’s codesigned, full-stack AI infrastructure is built to deliver the computational demand and help solve for the complexity of inference, while achieving greater efficiency and lowest cost per token. AI…

Apr 23, 2025 · Kyle Aubrey

Nvidia Software Pushes MLPerf Inference Benchmarks To New Highs

…For large language models (LLMs) and reasoning models, the coin of the realm are tokens – Huang last month called them the “new commodity” – and token generation, both the speed and the cost…

Apr 2, 2026 · Jeff Burt

Companies are scrambling to stop employees from maxing out AI budgets with small tasks | TechCrunch

…COO, and CIO level, are still asking the question of whether they’re getting value from what we’re spending on in the context of AI.” The cost of tokens has thrown…

Jun 24, 2026 · Lucas Ropek

Discussions and forums

Hacker News · u/tinyopsstudio · May 26, 2026

Followed topics

Search

People also ask

Videos

You’re about to feel the AI money squeeze

AMD Instinct MI355X GPU Sets a New Bar for DeepSeek Inference

The Many Aspects of Inference Performance

Is this the dawn of the Tokenpocalypse? | TechCrunch

Top stories

Ditching the cloud for local AI — how I use two mini PCs to process millions of tokens a day and save money on costly API fees

OpenAI debuts Jalapeño, its first custom AI chip to cut ChatGPT costs and reduce Nvidia dependency

Microsoft Risks Trump's Ire By Abandoning The Costly OpenAI And Anthropic Models For China-Based DeepSeek's V4 Model For Enterprise Workloads

Tensordyne's 3nm Napier AI Chip Promises 13x Higher Token Throughput Than Blackwell & Blazes Past Rubin With 1000 Tokens/s In Multi-Trillion Parameter Models

What Drives AI Inference Profitability?

Nvidia Software Pushes MLPerf Inference Benchmarks To New Highs

Companies are scrambling to stop employees from maxing out AI budgets with small tasks | TechCrunch

Discussions and forums

Show HN: AI agent token cost calculator for Codex and Claude Code loops

Value for Money Is All You Need

Zai replaced the network architecture running GLM-5.1 inference and the gains are pretty wild

Show HN: Token Usage Meter 12 Providers and Coding Agent

Show HN: Open-source CLI to see your AI coding token usage and compare it

Building for the Rising Complexity of Agentic Systems with Extreme Co-Design | NVIDIA Technical Blog

'What a joke': Github Copilot's new token-based billing spurs consternation among devs | TechCrunch

Dell Launches Local ‘Deskside Agentic AI’ Workstations to Slash Cloud Token Costs