Search: NVFP4

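Introducing NVIDIA Nemotron 3 Super: An Open Hybrid Mamba-Transformer MoE for Agentic Reasoning (Korean edition)

…Native NVFP4 pretraining: Optimized for NVIDIA Blackwell, it significantly lowers memory requirements while speeding up inference on NVIDIA B200 by up to 4x compared to FP8 on NVIDIA H100, and it maintains accuracy. Multi-environment reinforcement learning (RL): NVIDIA NeMo Gym and…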

May 14, 2026 · Chris Alexiuk

Bringing AI Closer to the Edge and On-Device with Gemma 4 | NVIDIA Technical Blog

…From Blackwell, with NVFP4 quantized checkpoints coming soon, to Jetson platforms, developers can quickly get started deploying these high-accuracy multimodal models, with the flexibility to meet their speed, security, and cost…

Apr 2, 2026 · Anu Srivastava

Building NVIDIA Nemotron 3 Agents for Reasoning, Multimodal RAG, Voice, and Safety | NVIDIA Technical Blog

…Nemotron 3 Super employs a hybrid Mamba-Transformer MoE architecture with NVFP4 precision on Blackwell GPUs, achieving high throughput and efficiency for multi-agent tasks, while Nemotron 3 Content Safety delivers low…

Mar 24, 2026 · Chintan Patel

Streaming Tokens and Tools: Multi-Turn Agentic Harness Support in NVIDIA Dynamo | NVIDIA Technical Blog

…Harness-facing Dynamo settings Our experiments used the newly released nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 model, though the same issues apply across models, reasoning parsers, and tool-call parsers…

May 8, 2026 · Matej Kosec

Introducing Nemotron 3 Super: An Open Hybrid Mamba-Transformer MoE for Agentic Reasoning | NVIDIA Technical Blog

…Native NVFP4 pretraining optimized for NVIDIA Blackwell, significantly cutting memory requirements and speeding up inference by 4x on NVIDIA B200 compared to FP8 on NVIDIA H100, while maintaining accuracy. Multi-environment reinforcement…

Mar 11, 2026 · Chris Alexiuk

Data Center Deep Learning Product Performance Hub

…This is enabled by deep co-design across NVIDIA Blackwell, NVLink™, and NVLink Switch for scale-out; NVFP4 for low-precision accuracy; and NVIDIA Dynamo and TensorRT™ LLM for speed and flexibility…

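NVIDIA Nemotron 3 Nano Omni: Accelerating Multimodal Agentic Reasoning with a Single Open Model

…It also supports FP8 and NVFP4 quantization, efficient video sampling, and NVIDIA-optimized kernels to deliver predictable, low-latency inference. Combined with 3D convolution-based spatiotemporal processing, across GPUs from workstations to data centers and cloud deployments…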

May 12, 2026 · Anjali Shah

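NVIDIA Technical Blog (Japanese edition)

…An Open Hybrid Mamba-Transformer MoE for Agentic Reasoning: Nemotron 3 Super introduces architectural innovations that mitigate the typical trade-off between efficiency and accuracy in high-capacity reasoning models. 3 MIN READ · Feb 6, 2026 · 3 Ways NVFP4 Accelerates AI Training and Inference: Through extreme co-design by NVIDIA, significant performance gains with excellent accuracy can now be expected in both model training and inference. 2 MIN READ…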

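Maximizing Memory Efficiency to Run Large Models on NVIDIA Jetson

…If there is one key takeaway, it is to use the right quantization precision. Formats such as NVFP4, INT4, and W4A16 significantly reduce memory and storage requirements while maintaining high accuracy for many LLM workloads. Real-world use case: Reachy Mini Jetson Mini Assistant. To show the effect of these memory optimizations, consider Reachy Mini Jetson Assistant, an on-device conversational AI robot running on Jetson Orin Nano. This…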

Apr 20, 2026 · Anshuman Bhat

NVIDIA Achieves Leading Agentic Coding Performance on First Agentic AI Benchmark | NVIDIA Technical Blog

…NVIDIA GB300 NVL72 demonstrates up to 20x higher agentic coding performance. The NVIDIA Vera Rubin platform is expected to extend these gains by leveraging 50 PFLOPs of NVFP4 compute and leveraging the…

Jun 12, 2026 · Eduardo Alvarez
