Search: coding improvements

Paper page - ProRL: Effective Reinforcement Learning for Proactive Recommendation via Rectified Policy Gradient Estimation

…Our code is available at https://github.com/hongruhou89/ProRL. Standard policy gradients are fundamentally broken for proactive recommendation. ProRL fixes…

May 28, 2026

Paper page - Web2BigTable: A Bi-Level Multi-Agent LLM System for Internet-Scale Information Search and Extraction

…Through a closed-loop run-verify-reflect process, the framework jointly improves decomposition and execution over time via persistent, human-readable external memory, with self-evolving updates to each single-agent. During…

May 4, 2026

Paper page - EvolveMem: Self-Evolving Memory Architecture via AutoResearch for LLM Agents

…On LoCoMo, EvolveMem outperforms the strongest baseline by 25.7% relative and achieves a 78.0% relative improvement over the minimal baseline. On MemBench, EvolveMem exceeds the strongest baseline by 18.9…

May 15, 2026

Paper page - Retrieval from Within: An Intrinsic Capability of Attention-Based Models

…Is there a source code available? Get this paper in your agent: hf papers read 2605.05806 Don't have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash…

May 14, 2026

Paper page - CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing

…Furthermore, improvements from model scaling quickly saturate, strong general reasoning does not reliably translate to creative affordance discovery, and common inference-time strategies such as Chain-of-Thought yield limited gains. These…

May 7, 2026

Paper page - Turning Drift into Constraint: Robust Reasoning Alignment in Non-Stationary Environments

…Xiaoyu Yang, … Abstract: A novel framework called Autonomous Preference Optimization (APO) is proposed to address reasoning alignment challenges in multi-modal large language models under concept drift conditions, achieving improved robustness and…

May 7, 2026

Paper page - Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation

…https://huggingface.co/zhuhz22/Causal-Forcing And the full-stack open-source code: https://github.com/thu-ml/Causal-Forcing We release 2-step frame-wise AR model with 50% latency and…

May 15, 2026
