RewardHarness: Self-Evolving Agentic Post-Training
…Given a source image, candidate edited images, and an editing instruction, an Orchestrator selects the most relevant subset of tools and skills from the maintained library, and a frozen Sub-Agent uses…
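The selection step described above can be illustrated with a minimal sketch. It assumes the library is a flat list of described entries and that relevance is scored by simple word overlap with the editing instruction; the excerpt does not say how the Orchestrator actually ranks entries, so `LibraryEntry`, `select_entries`, `top_k`, and the example entries are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LibraryEntry:
    """Hypothetical record for one tool or skill in the maintained library."""
    name: str
    kind: str          # "tool" or "skill"
    description: str


def select_entries(instruction, library, top_k=2):
    """Return up to top_k entries whose descriptions share words with the instruction.

    Word overlap is a stand-in relevance score (an assumption), not the
    paper's selection method.
    """
    words = set(instruction.lower().split())
    scored = []
    for entry in library:
        overlap = len(words & set(entry.description.lower().split()))
        if overlap > 0:
            scored.append((overlap, entry))
    # Stable sort: entries with equal overlap keep their library order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:top_k]]


# Hypothetical library contents, for illustration only.
library = [
    LibraryEntry("color_checker", "tool", "compare color and tone between images"),
    LibraryEntry("object_detector", "tool", "detect whether an object was added or removed"),
    LibraryEntry("text_reader", "tool", "read text rendered in an image"),
]

selected = select_entries("change the color of the car to red", library)
print([entry.name for entry in selected])
```

In this sketch only the selected subset would be handed to the frozen Sub-Agent along with the source image, candidate edits, and instruction; how the Sub-Agent consumes them is not covered by the excerpt and is left out here.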