Guiding LLM Post-training Data Engineering with Model Internals from Sparse Autoencoders
…Yi Jing, …, Xiaozhi Wang

Abstract

SAERL uses Sparse Autoencoder-derived signals from model internals to enhance LLM reinforcement learning through diversity control, difficulty-aware curriculum learning, and quality-based data filtering. AI…