Top-n𝜎: Eliminating Noise in Logit Space for Robust Token Sampling of LLM

Chenxia Tang; Jianchun Liu; Hongli Xu; Liusheng Huang

doi:10.18653/v1/2025.acl-long.528

Abstract

Large language models (LLMs) rely heavily on sampling methods to generate diverse and high-quality text. While existing sampling methods like top-p and min-p have identified the detrimental effects of low-probability tails in LLMs’ outputs, they still fail to effectively distinguish between diversity and noise. This limitation stems from their reliance on probability-based metrics that are inherently sensitive to temperature scaling. Through empirical and theoretical analysis, we make two key discoveries: (1) the pre-softmax logits exhibit a clear statistical separation between informative tokens and noise, and (2) we prove the mathematical equivalence of min-p and top-(1-p) under uniform distribution over logits. These findings motivate the design of top-n𝜎, a novel sampling method that identifies informative tokens by eliminating noise directly in logit space. Unlike existing methods that become unstable at high temperatures, top-n𝜎 achieves temperature-invariant token selection while preserving output diversity. Extensive experiments across reasoning and creative writing tasks demonstrate that our method consistently outperforms existing approaches, with particularly significant improvements in high-temperature settings.
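The abstract does not spell out the selection rule, so the following is only a minimal sketch of the idea. It assumes a token is kept when its logit lies within n standard deviations of the maximum logit. The function names, the use of the population standard deviation, and the order in which temperature is applied are illustrative assumptions, not the authors' implementation.

```python
import math
import random
import statistics


def top_n_sigma_mask(logits, n=1.0):
    """Flag the tokens whose logit is within n standard deviations of the max.

    Assumption: the threshold is max(logits) - n * sigma, with sigma taken as
    the population standard deviation over the whole vocabulary.
    """
    max_logit = max(logits)
    sigma = statistics.pstdev(logits)
    threshold = max_logit - n * sigma
    return [x >= threshold for x in logits]


def sample_top_n_sigma(logits, n=1.0, temperature=1.0, rng=random):
    """Sample one token index from the tokens that survive the mask."""
    scaled = [x / temperature for x in logits]
    keep = top_n_sigma_mask(scaled, n)
    # Softmax weights over the kept tokens only; subtracting the max keeps
    # exp() numerically stable. random.choices normalises the weights.
    m = max(scaled)
    weights = [math.exp(x - m) if k else 0.0 for x, k in zip(scaled, keep)]
    return rng.choices(range(len(scaled)), weights=weights, k=1)[0]


# Hypothetical logits: three plausible tokens and a low-scoring tail.
example_logits = [10.0, 9.5, 9.0, 2.0, 1.5, 1.0, 0.5, 0.0]
mask_t1 = top_n_sigma_mask(example_logits, n=1.0)
mask_t5 = top_n_sigma_mask([x / 5.0 for x in example_logits], n=1.0)
```

Under this rule, dividing the logits by a temperature scales the maximum and the standard deviation by the same factor, so `mask_t1` and `mask_t5` select the same tokens. Temperature then changes only how probability mass is spread among the kept tokens, which is one way to read the temperature-invariance claim above.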

Anthology ID: 2025.acl-long.528
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 10758–10774
URL: https://aclanthology.org/2025.acl-long.528/
DOI: 10.18653/v1/2025.acl-long.528
Cite (ACL): Chenxia Tang, Jianchun Liu, Hongli Xu, and Liusheng Huang. 2025. Top-n𝜎: Eliminating Noise in Logit Space for Robust Token Sampling of LLM. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10758–10774, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Top-n𝜎: Eliminating Noise in Logit Space for Robust Token Sampling of LLM (Tang et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-long.528.pdf
