DeltaRubric: Generative Multimodal Reward Modeling via Joint Planning and Verification
… DeltaRubric operates in two steps. First, acting as a Disagreement Planner, the model generates a neutral, instance-specific verification checklist. Then, acting as a Checklist Verifier, it executes these self-generated checks against the image and question to produce the final grounded judgment. …
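The two-step flow described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation. It assumes a pairwise setting with two candidate responses (suggested by the name "Disagreement Planner" but not stated in the excerpt). The `generate` callable is a hypothetical stand-in for a single multimodal model queried under two roles, and the prompt wording, the `Judgment` type, and all function names are invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical model interface (an assumption): (prompt, image bytes) -> text.
Generate = Callable[[str, bytes], str]


@dataclass
class Judgment:
    checklist: List[str]  # checks produced in the planning step
    findings: List[str]   # one finding per check from the verification step
    verdict: str          # final preference, e.g. "A" or "B"


def plan_checklist(generate: Generate, image: bytes, question: str,
                   response_a: str, response_b: str) -> List[str]:
    """Step 1 (planner role): ask for neutral, instance-specific checks."""
    prompt = (
        "PLAN: List neutral checks, one per line, that would settle where "
        "the two responses disagree. Do not state which one is better.\n"
        f"Question: {question}\nResponse A: {response_a}\n"
        f"Response B: {response_b}"
    )
    raw = generate(prompt, image)
    # Drop blank lines and strip leading bullet characters.
    return [ln.lstrip("-* ").strip() for ln in raw.splitlines() if ln.strip()]


def verify_checklist(generate: Generate, image: bytes, question: str,
                     response_a: str, response_b: str,
                     checklist: List[str]) -> Judgment:
    """Step 2 (verifier role): run each check, then give a final judgment."""
    findings = [
        generate(f"CHECK: {check}\nQuestion: {question}\n"
                 f"Response A: {response_a}\nResponse B: {response_b}", image)
        for check in checklist
    ]
    summary = "\n".join(f"- {c}: {f}" for c, f in zip(checklist, findings))
    verdict = generate(
        "VERDICT: Given these findings, answer with A or B only.\n" + summary,
        image,
    ).strip()
    return Judgment(checklist=checklist, findings=findings, verdict=verdict)


def judge(generate: Generate, image: bytes, question: str,
          response_a: str, response_b: str) -> Judgment:
    """Chain both roles using the same underlying model."""
    checklist = plan_checklist(generate, image, question,
                               response_a, response_b)
    return verify_checklist(generate, image, question,
                            response_a, response_b, checklist)
```

Passing the same `generate` callable to both steps mirrors the description of one model changing roles. Returning the checklist and per-check findings alongside the verdict keeps the judgment inspectable.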