Making Softmax More Efficient with NVIDIA Blackwell Ultra | NVIDIA Technical Blog
…Alleviating the softmax bottleneck in Blackwell Ultra
By doubling the throughput of the special function unit (SFU) for exponentials in the Blackwell Ultra architecture, NVIDIA alleviates this bottleneck and allows for a more…
