Search: timing uncertainty

Paper page - Compliance versus Sensibility: On the Reasoning Controllability in Large Language Models

…The following papers were recommended by the Semantic Scholar API: Deliberative Alignment is Deep, but Uncertainty Remains: Inference time safety improvement in reasoning via attribution of unsafe behavior to base model (2026…

May 1, 2026

Paper page - Hide-and-Seek in Trajectories: Discovering Failure Signals for VLA Runtime Monitoring

…Existing failure detection methods either rely on expensive action resampling or external models, while alternatives propagate trajectory-level labels uniformly across every timestep, obscuring localized failure signals. In this paper, we propose…

Jun 1, 2026

Paper page - Quality-Guided Semi-Supervised Learning for Medical Image Segmentation

…Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Training accurate medical image segmentation models requires large amounts of densely annotated data, which is costly and time-consuming to obtain. Semi-supervised learning…

Jun 5, 2026

Paper page - Dystruct: Dynamically Structured Diffusion Language Model Decoding via Bayesian Inference

…At each window expansion step, the method integrates local uncertainty with structural signals via a unified mechanism that supports dynamic structured generation, including both flexible block expansion and block organization, while maintaining…

May 12, 2026

Paper page - Think, then Score: Decoupled Reasoning and Scoring for Video Reward Modeling

…AI-generated summary: Recent advances in generative video models are increasingly driven by post-training and test-time scaling, both of which critically depend on the quality of video reward models (RMs…

May 8, 2026

Paper page - The Shadow Price of Reasoning: Economic Perspective on Optimal Budget Allocation for LLMs

…The following papers were recommended by the Semantic Scholar API: Adaptive Test-Time Compute Allocation for Reasoning LLMs via Constrained Policy Optimization (2026); Uncertainty-Aware Budget Allocation for Adaptive Test-Time Reasoning…

Jun 5, 2026

Paper page - Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning

…Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Selecting the best response from multiple small-model samples using a stronger scorer is a simple inference-time strategy, but fails when the small…

Jun 2, 2026

Paper page - PEEK: Picking Essential frames via Efficient Knowledge distillation

…it adds only 5.2% to the captioning time, compared with 65.4% for CSTA and 211.9% for MaxInfo. We release our code and pre-trained checkpoint at https://github.com…

Jun 1, 2026

Paper page - Adaptive Teacher Exposure for Self-Distillation in LLM Reasoning

…This motivates treating teacher exposure not as a fixed hyperparameter but as a learnable training-time control variable. We therefore propose Adaptive Teacher Exposure for Self-Distillation (ATESD). ATESD models the reveal…

May 14, 2026

Paper page - Beyond Scalar Rewards by Internalizing Reasoning into Score Distributions

…Existing scalar, score-token, and pairwise reward models over-compress uncertainty and fine-grained score differences, while reasoning-based generative rewards provide stronger judgments but are costly to deploy and difficult to…

Jun 11, 2026
