Paper page - Useful Memories Become Faulty When Continuously Updated by LLMs
… More surprisingly, even when consolidating from ground-truth solutions, GPT-5.4 fails on 54% of a set of ARC-AGI problems it had previously solved without memory. …
…Lungchuan Chen Abstract A memory-augmented transformer architecture called Mela incorporates hierarchical memory modules inspired by human memory consolidation processes, enabling improved long-context language modeling through multi-granularity memory representations. AI…
…evolving memory framework that models memory as a heterogeneous graph and progressively refines its topology through three stages: initial connection formation, feedback-driven refinement, and long-term consolidation. During execution, FluxMem repairs…
…model that integrates 3D scene understanding and future geometry prediction within a single framework. Our approach addresses the distinct requirements of these tasks through synergistic designs. First, a BEV representation consolidates multi…
…Steering Probability Squeezing for Better Exploration in Reinforcement Learning for Large Language Models (2026) MCPO: Mastery-Consolidated Policy Optimization for Large Reasoning Models (2026) Too Correct to Learn: Reinforcement Learning on Saturated…
…Mastery-Consolidated Policy Optimization for Large Reasoning Models (2026) EP-GRPO: Entropy-Progress Aligned Group Relative Policy Optimization with Implicit Process Guidance (2026) PAPO: Stabilizing Rubric Integration Training via Decoupled Advantage Normalization…
… We provide a unified analysis of these two paradigms in consolidating multiple expert capabilities into a single model, identifying capability loss in different ways: mixed RLVR suffers from inter-capability divergence cost, while the pipeline of first training experts and then performing OPD, tho… …
… The following papers were recommended by the Semantic Scholar API: Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora (2026) Self-Consolidating Language Models: Continual Knowledge Incorporation from Context (2026) Self-Improvement of Large Language Models: A Te… …
…Unified Post-Training for Large Vision-Language Models (2026) OP-GRPO: Efficient Off-Policy GRPO for Flow-Matching Models (2026) SCOPE: Signal-Calibrated On-Policy Distillation Enhancement with Dual-Path Adaptive Weighting…
… By consolidating this rapidly expanding field into a coherent framework, this survey aims to serve as a foundational reference for future research on large-scale AVI. …