Showing top 130 results for "LLMs"

LLMs

Large language models are machine learning models trained to predict and generate text and other language-based outputs.

373 articles indexed

Model Quantization: Post-Training Quantization Using NVIDIA Model Optimizer | NVIDIA Technical Blog

…Its vision encoder serves as the visual backbone in multimodal LLMs, such as LLaVA, and open-vocabulary perception models, such as OWL-ViT. Successors such as OpenCLIP and SigLIP scale the data…

May 7, 2026 · Ruixiang Wang

Data Center Deep Learning Product Performance Hub

…This is enabled by deep co-design across NVIDIA Blackwell, NVLink™, and NVLink Switch for scale-out; NVFP4 for low-precision accuracy; and NVIDIA Dynamo and TensorRT™ LLM for speed and flexibility…

Building NVIDIA Nemotron 3 Agents for Reasoning, Multimodal RAG, Voice, and Safety | NVIDIA Technical Blog

…Natural conversations with Nemotron 3 VoiceChat: Traditional voice AI relies on cascaded pipelines, automatic speech recognition (ASR), a large language model (LLM), and text-to-speech (TTS), all of which introduce latency…

Mar 24, 2026 · Chintan Patel

How to Minimize Game Runtime Inference Costs with Coding Agents | NVIDIA Technical Blog

…Trapping the ghost: Andrej Karpathy, a founding member of OpenAI, likens working with large language models (LLMs) to summoning ghosts, an apt metaphor for LLM agents, especially ones that write code. Many…

Mar 3, 2026 · Brandon Rowlett

NVIDIA Holoscan

…STT and LLMs. SAM2: Segment Anything in Images and Videos. This application demonstrates how to run SAM2 models on a live video feed with the possibility…

Mitigating Indirect AGENTS.md Injection Attacks in Agentic Environments | NVIDIA Technical Blog

…Consider using the NVIDIA garak LLM vulnerability scanner to evaluate models for known prompt injection weaknesses, and apply NVIDIA NeMo Guardrails to filter and protect LLM inputs and outputs…

Apr 20, 2026 · Daniel Teixeira

Make Sense of Video Analytics by Integrating NVIDIA AI Blueprints | NVIDIA Technical Blog

…Returned context is inserted into the enrichment prompt set in the tunable VECTOR_RAG_ENRICHMENT_PROMPT before LLM generation. The tunable enrichment prompt used in the nutritional example is pictured below. Here…

Nov 3, 2025 · Ilyas Bankole-Hameed

Extract More Kernel Performance with NVIDIA CompileIQ Auto-Tuning | NVIDIA Technical Blog

…It targets critical kernel hotspots in workloads like LLM inference, where small code sections dominate compute time, enabling fractional performance gains to yield significant overall throughput improvements. CompileIQ supports multi-objective optimization…

May 26, 2026 · Aditya Srikanth

NVIDIA NeMo Retriever

…Pass top results to Nemotron LLM to produce grounded, contextually relevant responses. Introductory Resources: Learn more about building an intelligent document processing pipeline with Nemotron. Nemotron Labs Blog: Learn how AI agents…

NVIDIA Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents | NVIDIA Technical Blog

…SGLang, TRT-LLM, vLLM. Cloud service providers: Amazon SageMaker JumpStart, Google Cloud, Microsoft Foundry, Oracle Cloud. Inference service providers: Baseten, DeepInfra, Eigen AI, fal (ASR), Fireworks AI, FriendliAI, Modal, ModelScope, Ollama cloud…

Jun 4, 2026 · Chris Alexiuk
