Reassess Summary Factual Inconsistency Detection with Large Language Model

Jiuding Yang, Hui Liu, Weidong Guo, Zhuwei Rao, Yu Xu, Di Niu


Abstract
Ensuring factual consistency between the summary and the original document is paramount in summarization tasks. Consequently, considerable effort has been dedicated to detecting inconsistencies. With the advent of Large Language Models (LLMs), recent studies have begun to leverage their advanced language understanding capabilities for inconsistency detection. However, early attempts have shown that LLMs underperform traditional models due to their limited ability to follow instructions and the absence of an effective detection methodology. In this study, we reassess summary inconsistency detection with LLMs, comparing the performance of GPT-3.5 and GPT-4. To advance research in LLM-based inconsistency detection, we propose SIFiD (Summary Inconsistency Detection with Filtered Document), which identifies key sentences within documents by either employing natural language inference or measuring semantic similarity between summaries and documents.
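The document-filtering idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.3 threshold, the naive sentence splitter, and the bag-of-words cosine similarity are all assumptions made for illustration, with the cosine score standing in for whatever semantic similarity or natural language inference scorer the paper actually uses. As the method name suggests, the filtered text would presumably replace the full document when prompting an LLM.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (assumption): break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bow(sentence):
    """Lowercased bag-of-words counts; a stand-in for a real sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def filter_document(document, summary, threshold=0.3):
    """Keep document sentences whose best match against any summary
    sentence reaches the (hypothetical) similarity threshold."""
    summary_vecs = [bow(s) for s in split_sentences(summary)]
    kept = []
    for sent in split_sentences(document):
        vec = bow(sent)
        score = max((cosine(vec, sv) for sv in summary_vecs), default=0.0)
        if score >= threshold:
            kept.append(sent)
    return " ".join(kept)


if __name__ == "__main__":
    doc = (
        "The cat sat on the mat. Stocks fell sharply on Monday. "
        "The dog barked loudly."
    )
    print(filter_document(doc, "The cat sat on a mat."))
```

Swapping `cosine` over bag-of-words vectors for an embedding model or an entailment score would change only the scoring step; the keep-or-drop loop stays the same.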
Anthology ID:
2024.knowllm-1.3
Volume:
Proceedings of the 1st Workshop on Towards Knowledgeable Language Models (KnowLLM 2024)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Sha Li, Manling Li, Michael JQ Zhang, Eunsol Choi, Mor Geva, Peter Hase, Heng Ji
Venues:
KnowLLM | WS
Publisher:
Association for Computational Linguistics
Pages:
27–31
URL:
https://aclanthology.org/2024.knowllm-1.3
Cite (ACL):
Jiuding Yang, Hui Liu, Weidong Guo, Zhuwei Rao, Yu Xu, and Di Niu. 2024. Reassess Summary Factual Inconsistency Detection with Large Language Model. In Proceedings of the 1st Workshop on Towards Knowledgeable Language Models (KnowLLM 2024), pages 27–31, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Reassess Summary Factual Inconsistency Detection with Large Language Model (Yang et al., KnowLLM-WS 2024)
PDF:
https://aclanthology.org/2024.knowllm-1.3.pdf