RetrieverGuard: Empowering Information Retrieval to Combat LLM-Generated Misinformation

Chuwen Chen; Shuai Zhang

doi:10.18653/v1/2025.findings-naacl.249

Abstract

Large language models (LLMs) have demonstrated impressive capabilities in generating human-like text and have been shown to store factual knowledge within their extensive parameters. However, models like ChatGPT can still actively or passively generate false or misleading information, increasing the challenge of distinguishing between human-created and machine-generated content. This poses significant risks to the authenticity and reliability of digital communication. This work aims to enhance retrieval models’ ability to identify the authenticity of texts generated by large language models, with the goal of improving the truthfulness of retrieved texts and reducing the harm of false information in the era of large models. Our contributions include: (1) we construct a diverse dataset of authentic human-authored texts and highly deceptive AI-generated texts from various domains; (2) we propose a self-supervised training method, RetrieverGuard, that enables the model to capture textual rules and styles of false information from the corpus without human-labelled data, achieving higher accuracy and robustness in identifying misleading and highly deceptive AI-generated content.

Anthology ID: 2025.findings-naacl.249
Volume: Findings of the Association for Computational Linguistics: NAACL 2025
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 4399–4411
URL: https://aclanthology.org/2025.findings-naacl.249/
DOI: 10.18653/v1/2025.findings-naacl.249
Cite (ACL): Chuwen Chen and Shuai Zhang. 2025. RetrieverGuard: Empowering Information Retrieval to Combat LLM-Generated Misinformation. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 4399–4411, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): RetrieverGuard: Empowering Information Retrieval to Combat LLM-Generated Misinformation (Chen & Zhang, Findings 2025)
PDF: https://aclanthology.org/2025.findings-naacl.249.pdf
