Verifiable Rewards Beyond Math and Code: Lightweight Corpus-Grounded Process Supervision for Factual Question Answering
…Shicheng Fan, Haochang Hao, …

Abstract: CorVer, a corpus-grounded reward mechanism, enhances factual accuracy in question answering by providing efficient sentence-level feedback through Wikipedia co-occurrence statistics, outperforming neural verifiers while…