Huijun Tang
2026
What Tokens Truly Matter? The Logit Conflation Problem in LLM Sampling
Pinlong Zhao | Huijun Tang | Pengfei Jiao | Mengyang Li
Findings of the Association for Computational Linguistics: ACL 2026
Sampling methods for large language models select candidate tokens based on logit statistics, implicitly assuming that high logits indicate desirable outputs. We identify the Logit Conflation Problem, where a token’s logit aggregates prompt-independent factors, including linguistic fluency and parametric associations, with prompt-relevance. However, only prompt-relevance determines instruction-following quality. We propose SEAL-Sampling (Signal Extraction for Active ReLevance) to isolate this component through attention-weighted attribution. Our framework defines prompt-relevance as the causal effect of prompt content on token logits and establishes attention patterns as an efficient proxy. Experiments on LLaMA-3 demonstrate significant improvements over top-nσ, with gains of 1.8% on AlpacaEval 2.0 and 2.2% on IFEval. Furthermore, attribution scores correlate weakly with raw logits, confirming the extraction of an orthogonal signal. The method is training-free and introduces minimal latency, adding less than 12ms overhead per token.
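The conflation described in the abstract can be illustrated with a toy example. The sketch below is not the SEAL-Sampling implementation, which the abstract does not specify. It assumes that top-nσ keeps tokens whose logit lies within n standard deviations of the maximum logit, and it approximates prompt-relevance by a simple ablation: the change in each logit when the prompt is removed. The function names and the logit values are hypothetical and chosen only for illustration.

```python
import numpy as np

def top_n_sigma_mask(logits: np.ndarray, n: float = 1.0) -> np.ndarray:
    """Baseline filter (assumed form): keep tokens with logit >= max - n * std."""
    return logits >= logits.max() - n * logits.std()

def prompt_relevance(with_prompt: np.ndarray, without_prompt: np.ndarray) -> np.ndarray:
    """Toy ablation estimate of the prompt's effect on each token logit."""
    return with_prompt - without_prompt

# Made-up logits over a 5-token vocabulary (illustrative values, not paper data).
with_prompt = np.array([5.0, 4.8, 2.0, 1.0, 0.5])
# Hypothetical logits for the same position with the prompt content ablated.
without_prompt = np.array([4.9, 1.0, 1.9, 1.0, 0.4])

mask = top_n_sigma_mask(with_prompt, n=1.0)
relevance = prompt_relevance(with_prompt, without_prompt)

# Token with the highest raw logit versus the most prompt-relevant surviving token.
top_by_logit = int(np.argmax(with_prompt))
top_by_relevance = int(np.argmax(np.where(mask, relevance, -np.inf)))

print("kept by top-n-sigma:", mask.tolist())
print("highest raw logit:", top_by_logit)
print("highest relevance among kept tokens:", top_by_relevance)
```

In this toy setting, token 0 has the highest raw logit, but almost all of that logit is present even without the prompt, whereas token 1 owes most of its logit to the prompt. A filter based only on logit statistics cannot distinguish the two cases. The paper uses attention patterns as an efficient proxy for this effect rather than an explicit ablation pass like the one assumed here.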