OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification
…Second, the token-level logit signal itself is brittle, depending on a narrow overlap of plausible next tokens between teacher and student, and prone to amplifying degenerate patterns such as repetition loops…
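To make the overlap concern concrete, the toy sketch below compares a teacher and two hypothetical students at a single token position. It is an illustration only, not the paper's method: the six-token vocabulary, the logit values, the helper names (`top_k_overlap`, `reverse_kl`), and the use of reverse KL as the token-level signal are all assumptions chosen for the example.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_ids(probs, k):
    # Indices of the k most probable tokens.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return set(order[:k])

def top_k_overlap(teacher_logits, student_logits, k):
    # Fraction of the teacher's top-k tokens that are also in the student's top-k.
    t = top_k_ids(softmax(teacher_logits), k)
    s = top_k_ids(softmax(student_logits), k)
    return len(t & s) / k

def reverse_kl(student_logits, teacher_logits):
    # KL(student || teacher) at one position, in nats (assumed signal, see lead-in).
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    return sum(ps * math.log(ps / pt) for ps, pt in zip(p_s, p_t) if ps > 0)

# Hypothetical logits over a 6-token vocabulary.
teacher = [4.0, 3.5, 0.0, -1.0, -2.0, -3.0]
student_close = [3.8, 3.6, 0.2, -1.0, -2.0, -3.0]   # agrees with the teacher's top tokens
student_far = [-3.0, -2.0, -1.0, 0.0, 3.5, 4.0]     # prefers tokens the teacher finds unlikely

for name, student in [("close", student_close), ("far", student_far)]:
    overlap = top_k_overlap(teacher, student, k=2)
    kl = reverse_kl(student, teacher)
    print(f"{name}: top-2 overlap={overlap:.2f}, reverse KL={kl:.3f} nats")
```

In this toy setting, the student whose top tokens match the teacher's gets a small divergence, while the student with no top-2 overlap gets a much larger one, dominated by tokens the teacher assigns very little probability. That is one way to read the brittleness described above: the signal is only well-behaved where the two distributions already agree on the plausible candidates.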