Not Lost After All: How Cross-Encoder Attribution Challenges Position Bias Assumptions in LLM Summarization
Elahe Rahimi | Hassan Sajjad | Domenic Rosati | Abeer Badawi | Elham Dolatabadi | Frank Rudzicz
Findings of the Association for Computational Linguistics: EMNLP 2025
Position bias, the tendency of Large Language Models (LLMs) to select content based on its structural position in a document rather than its semantic relevance, has been viewed as a key limitation in automatic summarization. To measure position bias, prior studies rely heavily on n-gram matching techniques, which fail to capture semantic relationships in abstractive summaries where content is extensively rephrased. To address this limitation, we apply a cross-encoder-based alignment method that jointly processes summary-source sentence pairs, enabling more accurate identification of semantic correspondences even when summaries substantially rewrite the source. Experiments with five LLMs across six summarization datasets reveal position bias patterns significantly different from those reported by traditional metrics. Our findings suggest that these patterns primarily reflect rational adaptations to document structure and content rather than true model limitations. Through controlled experiments and analyses across varying document lengths and multi-document settings, we show that LLMs use content from all positions more effectively than previously assumed, challenging common claims about “lost-in-the-middle” behaviour.
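To make the alignment idea concrete, here is a minimal sketch of the general approach: each summary sentence is scored against every source sentence, aligned to the best-scoring one, and the aligned source indices are binned by position (beginning, middle, end) to estimate where summary content comes from. All names are illustrative, and the `overlap` scorer is a toy lexical stand-in for a real cross-encoder, which would jointly encode each summary-source pair and output a semantic similarity score; this is not the authors' implementation.

```python
from collections import Counter


def align_positions(summary_sents, source_sents, score_fn, n_bins=3):
    """Align each summary sentence to its best-scoring source sentence,
    then return the fraction of alignments falling in each position bin
    (bin 0 = beginning of the source, bin n_bins-1 = end)."""
    bins = Counter()
    for s in summary_sents:
        scores = [score_fn(s, d) for d in source_sents]
        best = max(range(len(source_sents)), key=scores.__getitem__)
        # Map the aligned source index to a coarse position bin.
        bins[min(best * n_bins // len(source_sents), n_bins - 1)] += 1
    total = sum(bins.values())
    return {b: bins[b] / total for b in range(n_bins)}


# Toy scorer: Jaccard token overlap. A cross-encoder (e.g. a fine-tuned
# transformer scoring the concatenated pair) would replace this.
def overlap(a, b):
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(1, len(ta | tb))


source = [
    "The cat sat on the mat.",
    "Dogs bark loudly at night.",
    "Birds fly south in winter.",
]
summary = ["A cat was on a mat.", "Birds migrate south for winter."]

dist = align_positions(summary, source, overlap)
print(dist)  # → {0: 0.5, 1: 0.0, 2: 0.5}
```

Under an n-gram metric, the heavily rephrased second summary sentence might fail to match anywhere; a semantic scorer still aligns it to the final source sentence, which is exactly the kind of correspondence the paper argues lexical matching misses.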