The First Token Knows: Single-Decode Confidence for Hallucination Detection
… 0.793 semantic agreement vs. 0.791 surface-form self-consistency.
Across Llama-3.1-8B, Mistral-7B-v0.3, and Qwen2.5-7B on PopQA and TriviaQA (n=1000 each).
Ensembling ϕ_first with semantic agreement adds only +0.02 AUROC: first-token confidence already carries most of the signal.
Feedback welcome. …
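To make the comparison concrete, here is a minimal sketch of scoring answers by first-token confidence and evaluating that score with AUROC. The excerpt does not define the first-token confidence score ϕ, so the code assumes it is the maximum softmax probability over the first generated token's logits. That definition, the function names, and the toy logits and labels are all illustrative assumptions, not the paper's method or data.

```python
import math


def first_token_confidence(logits):
    """Assumed score: max softmax probability over the first token's logits."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return max(exps) / sum(exps)


def auroc(scores, labels):
    """AUROC as P(score of a correct answer > score of a hallucinated one).

    labels: 1 = correct answer, 0 = hallucination. Ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): one logit vector per question, first decode step only.
toy_logits = [
    [4.0, 0.5, 0.1],  # peaked distribution
    [0.3, 0.2, 0.1],  # flat distribution
    [3.0, 0.0, 0.0],  # peaked distribution
    [1.0, 0.9, 0.8],  # flat distribution
]
toy_labels = [1, 0, 1, 0]  # hypothetical correctness labels

toy_scores = [first_token_confidence(l) for l in toy_logits]
print(auroc(toy_scores, toy_labels))
```

The pairwise AUROC is quadratic in the number of examples, which is fine at n=1000 per dataset. A rank-based implementation would be the usual choice for larger evaluations.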