Improving Faithfulness by Augmenting Negative Summaries from Fake Documents

Tianshu Wang, Faisal Ladhak, Esin Durmus, He He


Abstract
Current abstractive summarization systems tend to hallucinate content that is unfaithful to the source document, posing a risk of misinformation. To mitigate hallucination, we must teach the model to distinguish hallucinated summaries from faithful ones. However, the commonly used maximum likelihood training does not disentangle factual errors from other model errors. To address this issue, we propose a back-translation-style approach to augment negative samples that mimic factual errors made by the model. Specifically, we train an elaboration model that generates hallucinated documents given the reference summaries, and then generate negative summaries from the fake documents. We incorporate the negative samples into training through a controlled generator, which produces faithful/unfaithful summaries conditioned on the control codes. Additionally, we find that adding textual entailment data through multitasking further boosts the performance. Experiments on three datasets (XSum, Gigaword, and WikiHow) show that our method consistently improves faithfulness without sacrificing informativeness according to both human and automatic evaluation.
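To illustrate the controlled-generator idea from the abstract, the sketch below builds training pairs in which a control code prepended to the document tells one seq2seq model whether to produce the faithful reference summary or the augmented negative summary. The token names, data layout, and example strings are assumptions for illustration, not the authors' exact implementation.

```python
# Hypothetical sketch: assembling control-code training pairs for a
# controlled summarizer. Token names and structure are assumed, not taken
# from the paper's code.
from dataclasses import dataclass

FAITHFUL = "[FAITHFUL]"       # assumed control code for faithful generation
HALLUCINATED = "[HALLUCINATED]"  # assumed control code for negative samples


@dataclass
class Example:
    source: str  # encoder input: control code + real document
    target: str  # decoder output: a faithful or negative summary


def build_examples(document: str, reference: str, negative: str) -> list[Example]:
    """Pair the real document with both summary types, each marked by a
    control code, so a single model learns to separate the two behaviors.
    The negative summary is assumed to come from the elaboration pipeline
    (reference -> fake document -> negative summary)."""
    return [
        Example(f"{FAITHFUL} {document}", reference),
        Example(f"{HALLUCINATED} {document}", negative),
    ]


examples = build_examples(
    "The city council met on Monday to approve the new budget.",
    "The council approved the budget on Monday.",
    # Mimics a factual error of the kind the augmentation targets:
    "The council rejected the budget after a week of protests.",
)
```

At inference time, one would prepend only the `[FAITHFUL]` code so the model generates in its faithful mode.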
Anthology ID:
2022.emnlp-main.816
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
11913–11921
URL:
https://aclanthology.org/2022.emnlp-main.816
DOI:
10.18653/v1/2022.emnlp-main.816
Cite (ACL):
Tianshu Wang, Faisal Ladhak, Esin Durmus, and He He. 2022. Improving Faithfulness by Augmenting Negative Summaries from Fake Documents. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 11913–11921, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Improving Faithfulness by Augmenting Negative Summaries from Fake Documents (Wang et al., EMNLP 2022)
PDF:
https://aclanthology.org/2022.emnlp-main.816.pdf