TFDP: Token-Efficient Disparity Audits for Autoregressive LLMs via Single-Token Masked Evaluation

Inderjeet Singh; Ramya Srinivasan; Roman Vainshtein; Hisashi Kojima

doi:10.18653/v1/2025.emnlp-main.1250

Abstract

Auditing autoregressive Large Language Models (LLMs) for disparities is often impeded by high token costs and limited precision. We introduce Token-Focused Disparity Probing (TFDP), a novel methodology overcoming these challenges by adapting single-token masked prediction to autoregressive architectures via targeted token querying. Disparities between minimally contrastive sentence pairs are quantified through a multi-scale semantic alignment score that integrates sentence, local-context, and token embeddings with adaptive weighting. We propose three disparity metrics for comprehensive assessment: Preference Score ($\mathcal{PS}$), Prediction Set Divergence ($\mathcal{PSD}$), and Weighted Final Score ($\mathcal{WFS}$). Evaluated on our customized Proverbs Disparity Dataset (PDD) with controlled attribute toggles (e.g., gender bias, misinformation susceptibility), TFDP precisely detects disparities while achieving up to 42 times fewer output tokens than minimal n-token continuations, offering a scalable tool for responsible LLM evaluation.

Anthology ID: 2025.emnlp-main.1250
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 24598–24615
URL: https://aclanthology.org/2025.emnlp-main.1250/
DOI: 10.18653/v1/2025.emnlp-main.1250
Cite (ACL): Inderjeet Singh, Ramya Srinivasan, Roman Vainshtein, and Hisashi Kojima. 2025. TFDP: Token-Efficient Disparity Audits for Autoregressive LLMs via Single-Token Masked Evaluation. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 24598–24615, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): TFDP: Token-Efficient Disparity Audits for Autoregressive LLMs via Single-Token Masked Evaluation (Singh et al., EMNLP 2025)
PDF: https://aclanthology.org/2025.emnlp-main.1250.pdf
Checklist: 2025.emnlp-main.1250.checklist.pdf
