Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding
…Hayate Iso

Abstract

Speculative decoding accelerates RL post-training by improving rollout throughput while preserving output distributions, with a projected 2.5x speedup at large scales.

AI-generated summary

RL post-training of…