Insights into LLM Long-Context Failures: When Transformers Know but Don’t Tell

Muhan Gao, TaiMing Lu, Kuai Yu, Adam Byerly, Daniel Khashabi


Abstract
Large Language Models (LLMs) exhibit positional bias, struggling to utilize information from the middle or end of long contexts. Our study explores LLMs’ long-context reasoning by probing their hidden representations. We find that while LLMs encode the position of target information, they often fail to leverage it when generating accurate responses. This reveals a disconnect between information retrieval and utilization, a “know but don’t tell” phenomenon. We further analyze the relationship between extraction time and final accuracy, offering insights into the underlying mechanics of transformer models.
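To make the probing setup concrete, below is a minimal sketch (not the authors' released code) of the kind of experiment the abstract describes: freeze an LLM, read out a hidden state, and train a linear probe to predict where the target information sits in the context. The model name ("gpt2"), the layer choice, and the toy needle-in-filler task are illustrative assumptions, not details from the paper.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

# Assumption: any small causal LM suffices for the sketch.
model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
model.eval()

def last_token_state(prompt, layer=-1):
    """Hidden state of the final prompt token at a chosen layer."""
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids)
    # hidden_states: tuple of (num_layers + 1) tensors, each [batch, seq, dim]
    return out.hidden_states[layer][0, -1].numpy()

# Toy probing dataset: a "needle" fact placed at one of three slots
# surrounded by filler; the probe's label is the slot index.
filler = "The sky is blue. "
X, y = [], []
for pos in range(3):
    for name in ["Alice", "Bob", "Carol", "Dan", "Eve", "Frank"]:
        docs = [filler] * 3
        docs[pos] = f"{name} holds the key. "
        prompt = "".join(docs) + "Question: who holds the key? Answer:"
        X.append(last_token_state(prompt))
        y.append(pos)

# Linear probe on the frozen representations; above-chance accuracy
# suggests the position of the target information is encoded.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print("probe train accuracy:", probe.score(X, y))

A real experiment would of course hold out data and sweep layers and context lengths; this toy loop only illustrates the pipeline, and the paper's finding is that such probes can succeed even when the model's generated answer is wrong.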
Anthology ID: 2024.findings-emnlp.447
Volume: Findings of the Association for Computational Linguistics: EMNLP 2024
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 7611–7625
URL: https://aclanthology.org/2024.findings-emnlp.447
Cite (ACL): Muhan Gao, TaiMing Lu, Kuai Yu, Adam Byerly, and Daniel Khashabi. 2024. Insights into LLM Long-Context Failures: When Transformers Know but Don’t Tell. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 7611–7625, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Insights into LLM Long-Context Failures: When Transformers Know but Don’t Tell (Gao et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-emnlp.447.pdf