German Also Hallucinates! Inconsistency Detection in News Summaries with the Absinth Dataset

Laura Mascarell; Ribin Chalumattu; Annette Rios Gonzales

Abstract

The advent of Large Language Models (LLMs) has led to remarkable progress on a wide range of natural language processing tasks. Despite these advances, such large models still hallucinate information in their output, which poses a major issue in automatic text summarization, as we must guarantee that the generated summary is consistent with the content of the source document. Previous research addresses the challenging task of detecting hallucinations in the output (i.e., inconsistency detection) in order to evaluate the faithfulness of generated summaries. However, these works primarily focus on English, and recent multilingual approaches lack German data. This work presents Absinth, a manually annotated dataset for hallucination detection in German news summarization, and explores the capabilities of novel open-source LLMs on this task in both fine-tuning and in-context learning settings. We release the Absinth dataset as open source to foster further research on hallucination detection in German.

Anthology ID: 2024.lrec-main.680
Volume: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month: May
Year: 2024
Address: Torino, Italia
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues: LREC | COLING
Publisher: ELRA and ICCL
Pages: 7696–7706
URL: https://aclanthology.org/2024.lrec-main.680
Cite (ACL): Laura Mascarell, Ribin Chalumattu, and Annette Rios. 2024. German Also Hallucinates! Inconsistency Detection in News Summaries with the Absinth Dataset. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 7696–7706, Torino, Italia. ELRA and ICCL.
Cite (Informal): German Also Hallucinates! Inconsistency Detection in News Summaries with the Absinth Dataset (Mascarell et al., LREC-COLING 2024)
PDF: https://aclanthology.org/2024.lrec-main.680.pdf
