Where am I? Large Language Models Wandering between Semantics and Structures in Long Contexts

Seonmin Koo, Jinsung Kim, YoungJoon Jang, Chanjun Park, Heuiseok Lim


Abstract
As the utilization of Large Language Models (LLMs) becomes more widespread, there is a growing demand for their ability to handle more complex and longer external knowledge across various use cases. Most existing evaluations of the open-domain question answering (ODQA) task, which necessitates the use of external knowledge, focus solely on whether the model provides the correct answer. However, even when LLMs answer correctly, they often fail to provide an obvious source for their responses. Therefore, it is necessary to jointly evaluate and verify the correctness of the answers and the appropriateness of the grounded evidence in complex external contexts. To address this issue, we examine the phenomenon of discrepancies in abilities across two distinct tasks—QA and evidence selection—when performed simultaneously, from the perspective of task alignment. To verify LLMs' task alignment, we introduce a verification framework and resources that consider both the semantic relevancy and the structural diversity of the given long-context knowledge. Through extensive experiments and detailed analysis, we provide insights into the task misalignment between QA and evidence selection. Our code and resources will be available upon acceptance.
Anthology ID:
2024.emnlp-main.783
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
14144–14160
URL:
https://aclanthology.org/2024.emnlp-main.783
DOI:
10.18653/v1/2024.emnlp-main.783
Bibkey:
Cite (ACL):
Seonmin Koo, Jinsung Kim, YoungJoon Jang, Chanjun Park, and Heuiseok Lim. 2024. Where am I? Large Language Models Wandering between Semantics and Structures in Long Contexts. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 14144–14160, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Where am I? Large Language Models Wandering between Semantics and Structures in Long Contexts (Koo et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.783.pdf