Search: Signal

Paper page - FocuSFT: Bilevel Optimization for Dilution-Aware Long-Context Fine-Tuning

…This training-time attention dilution (the starvation of content tokens in the attention distribution) weakens the gradient signal, limiting the model's ability to learn robust long-context capabilities. We introduce FocuSFT…

May 13, 2026

Paper page - On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment

…Existing safety-alignment signals are largely response-level or off-policy, and often incur a safety-utility trade-off: improving agent safety comes at the cost of degraded task performance. Such sparse…

Paper page - Large Language Models Explore by Latent Distilling

…The prediction error provides a novelty signal: familiar semantic trajectories become easier to predict, while under-explored directions produce higher error. We then use this signal to guide sampling toward less redundant…

Apr 29, 2026

Paper page - MulTaBench: Benchmarking Multimodal Tabular Learning with Text and Image

…We focus on predictive tasks where the modalities provide complementary predictive signal, and where generic embeddings lose critical information, necessitating Target-Aware Representations that are aligned with the task. Our experimental results…

May 14, 2026

Paper page - Dystruct: Dynamically Structured Diffusion Language Model Decoding via Bayesian Inference

…some require costly retraining to accommodate variable-length outputs, while others depend solely on local confidence signals during decoding. Such local criteria fail to capture the evolving structure of the sequence, often…

May 12, 2026

Paper page - Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning

…Generate proposes candidate trajectories and topologies; Filter constructs intermediate signals via verifiers, judges, critics; Control allocates compute and makes continuation/branching/stopping decisions under budgets; and Replay retains and reuses artifacts across…

May 6, 2026

Paper page - LEAD: Length-Efficient Adaptive and Dynamic Reasoning for Large Language Models

…LEAD dynamically calibrates the correctness-efficiency trade-off at each step using a Potential-Scaled Instability, directing optimization capacity to the most informative learning signal. Furthermore, it estimates an adaptive per-problem…

May 14, 2026

Paper page - Reinforcing Multimodal Reasoning Against Visual Degradation

…regularization, an auxiliary policy gradient loss anchored to clean-image advantages preserves a reliable reward signal; and to avoid systematically incorrect invariance, correctness-conditioned regularization restricts enforcement to successful trajectories. On Qwen3…

May 12, 2026

Paper page - Micro-Defects Expose Macro-Fakes: Detecting AI-Generated Images via Local Distributional Shifts

…Our theory-grounded analysis shows that patch-wise modeling yields provably larger discrepancies when localized forensic signals are present in generated images, enabling more reliable separation from real images. Extensive experiments demonstrate…

May 13, 2026

Paper page - DenoiseRL: Bootstrapping Reasoning Models to Recover from Noisy Prefixes

…This yields a richer and more diverse learning signal, improving exploration efficiency from imperfect model behavior. As a result, DenoiseRL improves reasoning performance and overall training efficiency while reducing the need for…

May 28, 2026
