Scaling Autonomous AI Agents and Workloads with NVIDIA DGX Spark | NVIDIA Technical Blog
… Roofline analysis of the attention decode kernel, which is optimized using Tile IR, suggests that the kernel scales effectively relative to the respective roofline, indicating that Tile IR is a viable option for scaling workloads. …
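As a rough illustration of what "scaling relative to the roofline" means, the sketch below computes the standard roofline bound (the lesser of the compute roof and the memory roof) and the fraction of that bound a kernel achieves. The function names and all hardware and kernel numbers are hypothetical placeholders chosen for easy arithmetic. They are not DGX Spark specifications or measurements from the post.

```python
def attainable_flops(arithmetic_intensity, peak_flops, peak_bandwidth):
    """Roofline bound in FLOP/s.

    arithmetic_intensity: FLOPs performed per byte moved (FLOP/byte)
    peak_flops:           compute roof (FLOP/s)
    peak_bandwidth:       memory roof slope (bytes/s)
    """
    return min(peak_flops, arithmetic_intensity * peak_bandwidth)


def roofline_efficiency(measured_flops, arithmetic_intensity,
                        peak_flops, peak_bandwidth):
    """Fraction of the roofline bound that a kernel actually achieves."""
    bound = attainable_flops(arithmetic_intensity, peak_flops, peak_bandwidth)
    return measured_flops / bound


# Hypothetical machine and kernel numbers (assumptions, not real specs).
PEAK_FLOPS = 100e12      # 100 TFLOP/s compute roof
PEAK_BANDWIDTH = 200e9   # 200 GB/s memory bandwidth
INTENSITY = 2.0          # low FLOP/byte, typical of memory-bound decode
MEASURED = 300e9         # hypothetical measured throughput in FLOP/s

bound = attainable_flops(INTENSITY, PEAK_FLOPS, PEAK_BANDWIDTH)
efficiency = roofline_efficiency(MEASURED, INTENSITY, PEAK_FLOPS, PEAK_BANDWIDTH)
print(f"roofline bound: {bound:.3e} FLOP/s, efficiency: {efficiency:.0%}")
```

Attention decode kernels tend to have low arithmetic intensity, so in a sketch like this the memory roof, rather than the compute roof, is usually the binding limit. Comparing efficiency against each platform's own roofline is one way to read "relative to the respective roofline".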