Search

Showing top 99 results for "NVFP4"

…a full-stack architecture built for speed, efficiency and scale. The Blackwell architecture features include: NVFP4 low-precision format for efficiency without loss of accuracy; fifth-generation NVIDIA NVLink that connects 72…

May 7, 2026

NVIDIA Unleashes Dynamic MFG And Big GeForce Now Upgrades At GDC 2026

…Labs' FLUX.2 Klein text-to-image models that are quantized down to FP8 and NVFP4. These models reduce output quality in exchange for massive performance improvements, as you can see in…

Mar 10, 2026 · Zak Killian

The Open Agentic AI World According To Nvidia

…At the show, Nvidia expanded its family of Nemotron 3 open models that it first introduced last year, including Nemotron 3 Ultra, which leverages the vendor’s NVFP4 format on the Blackwell…

Mar 18, 2026 · Jeff Burt

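How the NVIDIA Vera Rubin Platform Solves the Scale-Up Challenge of Agentic AI

…Vera Rubin NVL72 delivers up to 3,600 PFLOPS of NVFP4 compute, 20.7 TB of HBM4, and 1.6 PB/s of memory bandwidth per rack, handling prefill, long-context decode attention, and high-concurrency serving. Latency budgets grow even more…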

May 21, 2026 · Graham Steele

Discussions and forums

r/LocalLLaMA · u/LLMFan46 · 2w ago

Qwen3.5 35B A3B uncensored heretic Native MTP Preserved is Out Now With the Full 785 MTPs Preserved and Retained, Available in Safetensors, GGUFs. NVFP4, NVFP4 GGUFs and GPTQ-Int4 Formats

Safetensors, llmfan46/Qwen3.5-35B-A3B-uncensored-heretic-v2-Native-MTP-Preserved: https://huggingface.co/llmfan46/Qwen3.5-35B-A3B-uncensored-heretic-v2-Native-MTP-Preserved GGUFs, llmfan46/Qwen3.5-35B-A3B-uncensored-here…

r/LocalLLaMA · u/Kurcide · May 1, 2026

16x Spark Cluster (Build Update)

Build is done. 16 DGX Sparks on the fabric, all hitting line rate. Setup was time consuming but honestly smoother than I expected. Each Spark runs Nvidia’s flavor of Ubuntu out of the box with mostly everything pre insta…

r/nvidia · u/Kurcide · May 1, 2026

My 16x DGX Spark Cluster (HomeLab)

Added a 16x Spark Cluster to my homelab over the last few days. Curious if this is the largest Spark cluster anyone has built. About 2 years ago I had renovated my basement and built a personal lab/datacenter into my off…

r/homelab · u/Kurcide · May 1, 2026

Added a 16x DGX Spark cluster to my Homelab (Build Update)

NVIDIA Dynamo

…This is enabled by deep co-design across NVIDIA Blackwell, NVLink™, and NVLink Switch for scale-out; NVFP4 for low-precision accuracy; and NVIDIA Dynamo and TensorRT™ LLM for speed and flexibility…

How the NVIDIA Vera Rubin Platform is Solving Agentic AI’s Scale-Up Problem | NVIDIA Technical Blog

…Vera Rubin NVL72 delivers up to 3,600 PFLOPS of NVFP4 compute, 20.7 TB of HBM4, and 1.6 PB/s of memory bandwidth per rack, handling prefill, long-context decode…

May 14, 2026 · Graham Steele

Run Local AI Agents with Faster Models and Multi-Node Clustering on NVIDIA DGX Spark | NVIDIA Technical Blog

…DGX Spark agents using Qwen3.6-35B Developers can experience up to 2.6x faster inference with top agentic models like Qwen 3.6 35B on vLLM with NVIDIA’s NVFP4 quantized…

Jun 1, 2026 · Maitri Taneja

NVIDIA CEO Jensen Huang at Dell Technologies World: ‘Demand Is Going Parabolic, Utterly Parabolic’

…M2.7, DeepSeek Pro, DeepSeek-V4, GLM 5.1 and Kimi K2.6 with NVIDIA NVFP4 optimization — are available on the Dell Enterprise Hub on Hugging Face, joining Gemma 4, NVIDIA Nemotron…

May 18, 2026 · NVIDIA Writers

A closer look at Nvidia's Groq-powered LPX rack systems

…The chip doesn't use Nvidia's proprietary NVLink interconnect, it lacks NVFP4 hardware support, and it isn't CUDA-compatible at launch. We can therefore conclude that the $20 billion paid…

Mar 19, 2026 · Tobias Mann

NVIDIA Nemotron AI Models

…Native NVFP4 training, multi‑environment RL alignment, and fully open weights, datasets, recipes, and deployment cookbooks help developers quickly build and deploy customized agentic workflows. Starter Kits Start solving AI challenges by…
