MARBLE: Multi-Aspect Reward Balance for Diffusion RL
…No manual reward weighting, no multi-stage curriculum, and at near single-reward training cost. To the best of our knowledge, we are the first to address the reward balancing problem in multi…