Continual Harness: Online Adaptation for Self-Improving Foundation Agents
…relabeled by a frontier teacher and used to update the model, drives sustained in-game milestone progress on Pokemon Red without resetting the environment between training iterations.