Search

Showing top 142 results for "AI training from devs"

Optimizing Communication for Mixture-of-Experts Training with Hybrid Expert Parallel | NVIDIA Technical Blog

…language model training and CUDA kernel development. He has contributed to key features in the optimization of Megatron-Core and Transformer-Engine frameworks. He holds a master's degree from the Institute…

Feb 2, 2026 · Fan Yu

Speeding Up Variable-Length Training with Dynamic Context Parallelism and NVIDIA Megatron Core | NVIDIA Technical Blog

…an AI developer and technology engineer at NVIDIA, specializing in kernel optimization and accelerating LLM training, and contributing to Megatron-Core and Transformer-Engine. He holds a Ph.D. from Peking University…

Jan 28, 2026 · Kunlun Li

Pruning and Distilling LLMs Using NVIDIA TensorRT Model Optimizer | NVIDIA Technical Blog

…in AI training and inference at scale, performance engineering, and end-to-end application deployment. He brings full-stack GPU expertise spanning from chip design, CUDA and kernel-level development to server…

Oct 7, 2025 · Max Xu

Model Quantization: Post-Training Quantization Using NVIDIA Model Optimizer | NVIDIA Technical Blog

…Ruixiang Wang is a senior developer technology engineer for LLMs and generative AI at NVIDIA. His current focus is on optimizing AI workloads, including both training and inference, to achieve speed of…

May 7, 2026 · Ruixiang Wang

Unlock Exascale Performance on NVIDIA GB200 NVL72 with Slurm Topology-Aware Job Scheduling | NVIDIA Technical Blog

…Multiple GB200 NVL72 systems combined in a cluster create hierarchical network topology with large domains of very high networking bandwidth. An AI training job can greatly benefit from the abundant networking bandwidth…

May 21, 2026 · Sachin Lakharia

NVIDIA Data Center Deep Learning Product Performance

AI Training: Deploying AI in real-world applications requires training networks to convergence at a specified accuracy. This is the best methodology to test whether AI systems are ready to be deployed…

CUDA Toolkit - Free Tools and Training

…CUDA-X™ Libraries: A suite of AI, data science, and math libraries developed to help developers accelerate their applications. Training: Self-paced or instructor-led CUDA training courses for developers through the…

Newton Adds Contact-Rich Manipulation and Locomotion Capabilities for Industrial Robotics | NVIDIA Technical Blog

…His recent work focuses on developing high-performance, accurate physics engines to enable synthetic data generation for training physical AI, with a particular emphasis on robotic manipulation workflows. View all posts by…

Mar 16, 2026 · Philipp Reist

Data Center Deep Learning Product Performance Hub

…Latest NVIDIA Data Center Products. Training to Convergence: Deploying AI in real-world applications requires training networks to convergence at a specified accuracy. This is the best methodology to test whether AI…

Building Telco Reasoning Models for Autonomous Networks with NVIDIA NeMo | NVIDIA Technical Blog

Mar 1, 2026 · Aiden Chang
