Search: computer networking

Boosting Llama 3.1 405B Performance up to 1.44x with NVIDIA TensorRT Model Optimizer on NVIDIA H200 GPUs | NVIDIA Technical Blog

…The latter technique pre-computes scaling factors at compile time, rather than at run time, reducing inference compute overhead. This scaling is applied at per-tensor granularity. Table 1 shows maximum throughput…

Aug 28, 2024 · Anjali Shah

Implementing Falcon-H1 Hybrid Architecture in NVIDIA Megatron Core | NVIDIA Technical Blog

…While classical μP is rooted in neural network theory to enable effortless hyperparameter transfer from a base model size to larger models, Falcon-H1 extends this by tuning μP multipliers themselves. This…

Mar 9, 2026 · Mireille Fares

Validate Kubernetes for GPU Infrastructure with Layered, Reproducible Recipes | NVIDIA Technical Blog

…Recipes also carry constraints (minimum Kubernetes version, required OS, kernel version) and a computed deployment order based on component dependencies. Every recipe is validated against real clusters and reproducible across environments. You…

Mar 12, 2026 · Mark Chmarny

NVIDIA Blackwell Sets STAC-AI Record for LLM Inference in Finance | NVIDIA Technical Blog

…The HPE ProLiant Compute DL384 Gen12, powered by the NVIDIA GH200 Grace Hopper Superchip, provides an efficient single-server solution. To see detailed results, refer to the STAC report on HPE ProLiant…

May 27, 2026 · Dan Blanaru

Revolutionizing AI-Driven Material Discovery Using NVIDIA ALCHEMI | NVIDIA Technical Blog

…This contrasts with physics-informed neural networks (PINNs) that embed knowledge from physics equations and have relevance primarily in computational fluid dynamics. Geometry relaxation In many chemical and material discovery workflows, a…

Nov 18, 2024 · Wen Jie Ong

NVIDIA NVbandwidth: Your Essential Tool for Measuring GPU Interconnect and Memory Performance | NVIDIA Technical Blog

Apr 14, 2026 · Eva Sitaridi

Updating Classifier Evasion for Vision Language Models | NVIDIA Technical Blog

NVIDIA RTX Innovations Are Powering the Next Era of Game Development | NVIDIA Technical Blog

Improving Bash Generation in Small Language Models with Grammar-Constrained Decoding | NVIDIA Technical Blog

How to Eliminate Pipeline Friction in AI Model Serving | NVIDIA Technical Blog