Paper page - Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding
… Community
Seeing how much the gains hinge on draft initialization and on keeping drafts short, it looks like the method pays off most when the speculative drafts stay close to the current rollout distribution. My question is: how does speculative decoding beh… …