Paper page - Recovering Hidden Reward in Diffusion-Based Policies
…it would be neat to see how this scales to truly high-dimensional action spaces or more severe distribution shifts, to test whether the integrability bias consistently improves out-of-sample generalization…