Reinforcing Multimodal Reasoning Against Visual Degradation
…We propose ROMA, an RL fine-tuning framework that modifies the optimization dynamics to reinforce reasoning against visual degradation while preserving clean-input performance. A dual-forward-pass strategy uses teacher forcing…
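To make the phrase "dual forward pass with teacher forcing" concrete, here is a minimal sketch in plain Python. It is not ROMA's implementation: the excerpt is cut off before it says what the two passes are used for, so the sketch stops at the generic mechanism. It assumes the two passes score one fixed response token sequence, once conditioned on clean visual features and once on degraded ones. `toy_logits`, `teacher_forced_logprobs`, `dual_forward_pass`, the vocabulary size, and the feature vectors are all hypothetical names and values chosen for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

VOCAB_SIZE = 4  # hypothetical toy vocabulary


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def toy_logits(visual: Sequence[float], prefix: Sequence[int]) -> List[float]:
    """Hypothetical stand-in for a multimodal model.

    Each logit mixes a visual feature with a bonus for the token that
    follows the previous one, so the output depends on both inputs.
    """
    prev = prefix[-1] if prefix else 0
    return [
        visual[k % len(visual)] + (1.0 if k == (prev + 1) % VOCAB_SIZE else 0.0)
        for k in range(VOCAB_SIZE)
    ]


def teacher_forced_logprobs(
    visual: Sequence[float],
    tokens: Sequence[int],
    logits_fn: Callable[[Sequence[float], Sequence[int]], List[float]] = toy_logits,
) -> List[float]:
    """Score a fixed token sequence under teacher forcing.

    At step t the model is conditioned on the given tokens[:t], not on
    its own samples, and we read off the log-probability of tokens[t].
    """
    out = []
    for t, tok in enumerate(tokens):
        out.append(log_softmax(logits_fn(visual, tokens[:t]))[tok])
    return out


def dual_forward_pass(
    clean_visual: Sequence[float],
    degraded_visual: Sequence[float],
    tokens: Sequence[int],
) -> Tuple[List[float], List[float]]:
    """Run two teacher-forced passes over the same tokens.

    Assumption: one pass sees the clean input, the other the degraded
    input. How the two results are combined is left open here.
    """
    return (
        teacher_forced_logprobs(clean_visual, tokens),
        teacher_forced_logprobs(degraded_visual, tokens),
    )


if __name__ == "__main__":
    clean = [2.0, 0.5, 0.1, 0.0]     # hypothetical clean visual features
    degraded = [0.0, 0.0, 0.0, 0.0]  # hypothetical fully degraded features
    response = [0, 1, 2]             # a fixed, previously sampled response
    clean_lp, degraded_lp = dual_forward_pass(clean, degraded, response)
    print("clean   :", clean_lp)
    print("degraded:", degraded_lp)
```

Because both passes are forced onto the same tokens, their per-token log-probabilities line up position by position, which is what makes a clean-versus-degraded comparison well defined in this toy setting.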