Search: AI reasoning math

Paper page - EVOCHAMBER: Test-Time Co-evolution of Multi-Agent System at Individual, Team, and Population Scales

…On three heterogeneous task streams with Qwen3-8B, EVOCHAMBER reaches 63.9% on competition math, 75.7% on code, and 87.1% on multi-domain reasoning, outperforming the best baseline by 32…

May 13, 2026

Paper page - A Single Layer to Explain Them All: Understanding Massive Activations in Large Language Models

…Our approach consistently improves LLM performance across multiple tasks, including instruction following and math reasoning, in both training-free and fine-tuning settings. Moreover, we show that our method mitigates attention sinks…

May 13, 2026

Paper page - SCOPE: Structured Decomposition and Conditional Skill Orchestration for Complex Image Generation

…Process-Driven Image Generation via Interleaved Reasoning (2026); Affordance Agent Harness: Verification-Gated Skill Orchestration (2026); MathGen: Revealing the Illusion of Mathematical Competence through Text-to-Image Generation (2026); Enhanced Text-to…

May 11, 2026

Paper page - KL for a KL: On-Policy Distillation with Control Variate Baseline

…AI-generated summary: On-Policy Distillation (OPD) has emerged as a dominant post-training paradigm for large language models, especially for reasoning domains. However, OPD remains unstable in practice due to the…

May 15, 2026

Paper page - EDU-CIRCUIT-HW: Evaluating Multimodal Large Language Models on Real-World University-Level STEM Student Handwritten Solutions

…However, accurately interpreting unconstrained STEM student handwritten solutions with intertwined mathematical formulas, diagrams, and textual reasoning poses a significant challenge due to the lack of authentic and domain-specific benchmarks. Additionally, current…

May 8, 2026

Paper page - CurveBench: A Benchmark for Exact Topological Reasoning over Nested Jordan Curves

…Amirreza Mohseni. Abstract: CurveBench presents a benchmark for hierarchical topological reasoning using visual inputs, demonstrating significant challenges in exact topology-aware visual reasoning even with advanced models. AI-generated summary: We introduce…

May 15, 2026


Paper page - LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling

Paper page - PatRe: A Full-Stage Office Action and Rebuttal Generation Benchmark for Patent Examination

Paper page - AcademiClaw: When Students Set Challenges for AI Agents

Paper page - Reliable Chain-of-Thought via Prefix Consistency