Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA

Sergey Pletenev; Maria Marina; Nikolay Ivanov; Daria Galimzianova; Nikita Krayko; Mikhail Salnikov; Vasily Konovalov; Alexander Panchenko; Viktor Moskvoretskii

doi:10.18653/v1/2025.emnlp-main.434

Abstract

Large Language Models (LLMs) often hallucinate in question answering (QA) tasks. A key yet underexplored factor contributing to this is the temporality of questions – whether they are evergreen (answers remain stable over time) or mutable (answers change). In this work, we introduce EverGreenQA, the first multilingual QA dataset with evergreen labels, supporting both evaluation and training. Using EverGreenQA, we benchmark 12 modern LLMs to assess whether they encode question temporality explicitly (via verbalized judgments) or implicitly (via uncertainty signals). We also train EG-E5, a lightweight multilingual classifier that achieves SoTA performance on this task. Finally, we demonstrate the practical utility of evergreen classification across three applications: improving self-knowledge estimation, filtering QA datasets, and explaining GPT-4o’s retrieval behavior.

Anthology ID: 2025.emnlp-main.434
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 8603–8620
URL: https://aclanthology.org/2025.emnlp-main.434/
DOI: 10.18653/v1/2025.emnlp-main.434
Cite (ACL): Sergey Pletenev, Maria Marina, Nikolay Ivanov, Daria Galimzianova, Nikita Krayko, Mikhail Salnikov, Vasily Konovalov, Alexander Panchenko, and Viktor Moskvoretskii. 2025. Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 8603–8620, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA (Pletenev et al., EMNLP 2025)
PDF: https://aclanthology.org/2025.emnlp-main.434.pdf
Checklist: 2025.emnlp-main.434.checklist.pdf
