When Claims Evolve: Evaluating and Enhancing the Robustness of Embedding Models Against Misinformation Edits

Jabez Magomere; Emanuele La Malfa; Manuel Tonneau; Ashkan Kazemi; Scott A. Hale

doi:10.18653/v1/2025.findings-acl.1150

Abstract

Online misinformation remains a critical challenge, and fact-checkers increasingly rely on claim matching systems that use sentence embedding models to retrieve relevant fact-checks. However, as users interact with claims online, they often introduce edits, and it remains unclear whether current embedding models used in retrieval are robust to such edits. To investigate this, we introduce a perturbation framework that generates valid and natural claim variations, enabling us to assess the robustness of a wide range of sentence embedding models in a multi-stage retrieval pipeline and evaluate the effectiveness of various mitigation approaches. Our evaluation reveals that standard embedding models exhibit notable performance drops on edited claims, while LLM-distilled embedding models offer improved robustness at a higher computational cost. Although a strong reranker helps to reduce the performance drop, it cannot fully compensate for first-stage retrieval gaps. To address these retrieval gaps, we evaluate train- and inference-time mitigation approaches, demonstrating that they can improve in-domain robustness by up to 17 percentage points and boost out-of-domain generalization by 10 percentage points. Overall, our findings provide practical improvements to claim-matching systems, enabling more reliable fact-checking of evolving misinformation.
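
The robustness question in the abstract can be illustrated with a minimal sketch: retrieve fact-checks for the original claims, retrieve again for edited versions, and compare recall. Everything here is an assumption for illustration and not the paper's setup: `embed` is a toy bag-of-words stand-in for a sentence embedding model, the claims, typo edits, and fact-checks are invented examples, and only the first retrieval stage is shown (the reranker is omitted).

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for a sentence embedding model: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(claim, fact_checks, k):
    # First-stage retrieval: indices of the k most similar fact-checks.
    query = embed(claim)
    scored = [(cosine(query, embed(fc)), i) for i, fc in enumerate(fact_checks)]
    scored.sort(key=lambda s: (-s[0], s[1]))  # ties broken by index
    return [i for _, i in scored[:k]]

def recall_at_k(claims, gold, fact_checks, k):
    # Fraction of claims whose gold fact-check appears in the top k.
    hits = sum(1 for c, g in zip(claims, gold) if g in retrieve(c, fact_checks, k))
    return hits / len(claims)

# Hypothetical toy data, invented for this sketch.
fact_checks = [
    "5g towers do not spread viruses",
    "vaccines do not cause autism",
    "the earth is not flat",
]
original_claims = ["vaccines cause autism", "the earth is flat"]
edited_claims = ["vacines cuase autsim", "teh erth is flat"]  # typo edits
gold = [1, 2]  # index of the matching fact-check for each claim

recall_original = recall_at_k(original_claims, gold, fact_checks, k=1)
recall_edited = recall_at_k(edited_claims, gold, fact_checks, k=1)
robustness_gap = recall_original - recall_edited
print(recall_original, recall_edited, robustness_gap)
```

A word-overlap embedding is especially brittle to typos, which makes the gap easy to see on toy data. Swapping `embed` for a real sentence embedding model would leave the rest of the measurement unchanged.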

Anthology ID: 2025.findings-acl.1150
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 22374–22404
URL: https://aclanthology.org/2025.findings-acl.1150/
DOI: 10.18653/v1/2025.findings-acl.1150
Cite (ACL): Jabez Magomere, Emanuele La Malfa, Manuel Tonneau, Ashkan Kazemi, and Scott A. Hale. 2025. When Claims Evolve: Evaluating and Enhancing the Robustness of Embedding Models Against Misinformation Edits. In Findings of the Association for Computational Linguistics: ACL 2025, pages 22374–22404, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): When Claims Evolve: Evaluating and Enhancing the Robustness of Embedding Models Against Misinformation Edits (Magomere et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-acl.1150.pdf
