Paper page - Combinatorial Synthesis: Scaling Code RLVR via Atomic Decomposition and Recombination
… Prior studies often rely on heuristic seed expansions for data synthesis, which severely limits both novelty and difficulty. …
… To this end, we first compile Themis-CodeRewardBench, a benchmark to evaluate code RMs across five preference dimensions (i.e., criteria) and eight programming languages, on which we profile 50+ code, math, and general-purpose RMs. …