Scaling Autonomous AI Agents and Workloads with NVIDIA DGX Spark | NVIDIA Technical Blog
… The kernel under consideration is the attention decode kernel, optimized using Tile IR.

Performance scaling and optimization headroom

In Figure 1, the vertical position of the data points on the y-axis confirms that the kernel achieves higher hardware utilization on NVIDIA B200. …