Search: integrations & automations

How to Build a Voice Agent with RAG and Safety Guardrails | NVIDIA Technical Blog

…from transformers import AutoModel
model = AutoModel.from_pretrained(
    "nvidia/llama-nemotron-embed-vl-1b-v2",
    trust_remote_code=True,
    device_map="auto"
).eval()
# Embed queries and documents
query_embedding = model.encode_queries…

Jan 5, 2026 · Chris Alexiuk

Develop Physical AI Reasoning, World, and Action Models with NVIDIA Cosmos 3 | NVIDIA Technical Blog

…As SOTA video generation models saturate existing automated leaderboards, score differences between releases are often too narrow for meaningful comparison. HUE shifts evaluation from subjective grading to objective fact verification, enabling fine…

Jun 1, 2026 · Asawaree Bhide

Building Token‑Metered AI Services on Telco AI Factories | NVIDIA Technical Blog

…Behind the scenes, the platform provisions inference endpoints and meters usage in input and output tokens, API calls, or workflow executions, automatically enforcing quotas, rate limits, and SLAs. Together, TaaS enabled by…

May 21, 2026 · Waleed Badr

Extract More Kernel Performance with NVIDIA CompileIQ Auto-Tuning | NVIDIA Technical Blog

…CompileIQ’s evolutionary search finds that combination automatically. The team that hit a wall after exhausting every optimization lever they knew now has a new lever with CompileIQ—the compiler itself. CompileIQ…

May 26, 2026 · Aditya Srikanth

Validate Kubernetes for GPU Infrastructure with Layered, Reproducible Recipes | NVIDIA Technical Blog

…At NVIDIA he helps define GPU-accelerated Kubernetes and health-automation patterns for large-scale AI infrastructure, influencing how cloud providers and their customers run production GPU workloads reliably at scale. View…

Mar 12, 2026 · Mark Chmarny

Achieving Single-Digit Microsecond Latency Inference for Capital Markets | NVIDIA Technical Blog

…They validate that a platform can meet strict latency budgets for demanding use cases like high-frequency market making, short-term price prediction, and automated hedging. Furthermore, because the benchmark is designed…

Apr 2, 2026 · Nikolay Markovskiy

Establishing a Scalable Sparse Ecosystem with the Universal Sparse Tensor | NVIDIA Technical Blog

…Stay tuned for the integration of the UST with polymorphic operations that dispatch to optimized libraries or automatically generated code. This approach establishes a clean, easy-to-use, and scalable sparse ecosystem…

Jan 30, 2026

Nsight Systems - Get Started

…to automatically add the following to your Pod: an init container, volumes containing Nsight Systems, its configurations, environment variables, and security context. Download NVIDIA Nsight Tools Sidecar Injector JupyterLab integration: Download NVIDIA…

Building the AI Grid with NVIDIA: Orchestrating Intelligence Everywhere | NVIDIA Technical Blog

…This includes automatic speech recognition (ASR) and text-to-speech (TTS). Prefill and decode: Time the model spends processing the prompt (prefill) and generating the first token (decode) Voice activity detection (VAD…

Mar 17, 2026 · Sree Sankar

NVIDIA Nemotron AI Models

…How to Build a Voice-Powered RAG Agent Using New Nemotron Models Get a step-by-step guide on how to build a voice-powered RAG agent by integrating Nemotron models for…
