Search: Performance/compatibility

Accelerating Long-Context Inference with Skip Softmax in NVIDIA TensorRT LLM | NVIDIA Technical Blog

…What are the benefits of using Skip Softmax? Skip Softmax offers drop-in compatibility, hardware efficiency, flexibility, and versatility. Unlike approaches that need specific architectural modifications (such as Linear Attention), Skip Softmax…

Dec 16, 2025 · Laikh Tewari

NVIDIA CUDA Profiling Tools Interface (CUPTI) - CUDA Toolkit

…Key benefits include optimized memory usage by eliminating unused fields and padding, improved performance through compact data structures and faster data access, and improved backward compatibility as new fields can be added…

Nsight Systems - Get Started

…See the supported platforms for specifics about combinations of local, remote, and mixed-OS compatibilities. Download: Download Nsight Systems 2026.2.1 Windows Full Version Download for Linux on x86_64 This…

Streaming Tokens and Tools: Multi-Turn Agentic Harness Support in NVIDIA Dynamo | NVIDIA Technical Blog

…Streaming tool calls as typed dispatch events rather than buffering until turn completion enables immediate tool execution and improved harness compatibility, enhancing responsiveness in agent workflows.

May 8, 2026 · Matej Kosec

NVIDIA IGX Thor Powers Industrial, Medical, and Robotics Edge AI Applications | NVIDIA Technical Blog

…SoM, IGX T7000 Board Kit, and Developer Kits offering scalable AI performance, industrial-grade hardware, real-time Linux, rich I/O, and compatibility with both NVIDIA Jetson hardware and software stacks. Key features…

Mar 23, 2026 · Suhas Hariharapura Sheshadri

Advancing GPU Programming with the CUDA Tile IR Backend for OpenAI Triton | NVIDIA Technical Blog

…A simple environment variable configuration switches the compilation pipeline from PTX backend to the CUDA Tile IR backend, unlocking improved performance and future-proof architecture compatibility. Triton users will be able to…

Jan 30, 2026 · Jie Xin

NVIDIA NeMo Agent Toolkit

…Compatible with OpenTelemetry and major agent frameworks, the toolkit reduces cost and enhances performance while providing insights to scale from single agents to enterprise-grade digital workforces. Simplify Development: Experiment and prototype…

CUDA 13.2 Introduces Enhanced CUDA Tile Support and New Python Features | NVIDIA Technical Blog

…for improved compatibility and feature access. Math library advances feature experimental Grouped GEMM with MXFP8 support in cuBLAS for Blackwell GPUs, and FP64-emulated cuSOLVERD APIs for significant performance gains on INT8…

Mar 9, 2026 · Jonathan Bentz

NVIDIA Streamline SDK 2.4

…Yes, DLSS Frame Generation is compatible with HDR. However, DLSS Frame Generation currently does not support FP16 pixel format and scRGB color space. Q: Why does my performance drop when I enable…

Vulkan Driver Support

…vulkan_glsl compatibility when using GLSL directly; Improve GPU texturing performance in some cases; Improve vkAllocateDescriptorSets()/vkFreeDescriptorSets() performance in some cases; Improve vkCmdBindDescriptorSets() performance in some cases; Improve vkCmdCopyImage() performance in some…
