Showing top 130 results for "LLMs"

LLMs

Large language models are machine learning models trained to predict and generate text and other language-based outputs.

368 articles indexed · Last updated 15h ago

Pruning and Distilling LLMs Using NVIDIA TensorRT Model Optimizer | NVIDIA Technical Blog

… Large language models (LLMs) have set a high bar in natural language processing (NLP) tasks such as coding, reasoning, and math. …

Oct 7, 2025 · Max Xu

Build Next-Gen Physical AI with Edge‑First LLMs for Autonomous Vehicles and Robotics | NVIDIA Technical Blog

… NVIDIA TensorRT Edge-LLM, a high-performance C++ inference runtime for LLMs and vision language models (VLMs) on embedded platforms, is designed to overcome these challenges. …

Mar 12, 2026 · Lin Chai

Post-Training Quantization of LLMs with NVIDIA NeMo and NVIDIA TensorRT Model Optimizer | NVIDIA Technical Blog

… As large language models (LLMs) are becoming even bigger, it is increasingly important to provide easy-to-use and efficient deployment paths because the cost of serving such LLMs is becoming higher. …

Sep 10, 2024 · Jan Lasek

Winning a Kaggle Competition with Generative AI–Assisted Coding | NVIDIA Technical Blog

… To generate new ideas, we can either propose them or have LLMs generate them. There are many effective ways to encourage LLMs to generate ideas, such as: Ask LLMs to find and read research papers on the topic. Ask LLMs to read forums and publicly shared code on the topic. …

Apr 23, 2026 · Chris Deotte

NVIDIA Blackwell Sets STAC-AI Record for LLM Inference in Finance | NVIDIA Technical Blog

… Large language models (LLMs) are revolutionizing the financial trading landscape by enabling sophisticated analysis of vast amounts of unstructured data to generate actionable trading insights. …

May 27, 2026 · Dan Blanaru

A Guide to Building Deep Agents for Enterprise Search with NVIDIA AI-Q and LangChain

Mar 25, 2026 · Sean Lopp

LLM Inference Benchmarking: How Much Does Your LLM Inference Cost? | NVIDIA Technical Blog

… He has contributed to production applications of LLMs covering RAG systems, optimization of inference servers, pretraining of LLMs from scratch, custom evaluation of LLMs, or quantization using FP8 formats. …

Jun 18, 2025 · Vinh Nguyen

Accelerating Long-Context Inference with Skip Softmax in NVIDIA TensorRT LLM | NVIDIA Technical Blog

… For machine learning engineers deploying LLMs at scale, the equation is familiar and unforgiving: as context length increases, attention computation costs explode. …

Dec 16, 2025 · Laikh Tewari

Automating Inference Optimizations with NVIDIA TensorRT LLM AutoDeploy | NVIDIA Technical Blog

… NVIDIA TensorRT LLM enables developers to build high-performance inference engines for large language m… …

Feb 9, 2026 · Lucas Liebenwein, Suyog Gupta, and Laikh Tewari

How Small Language Models Are Key to Scalable Agentic AI | NVIDIA Technical Blog

… The new role of LLMs in a heterogeneous AI architecture: This doesn’t mean LLMs are obsolete. …

Aug 29, 2025 · Peter Belcak
