Error Analysis of using BART for Multi-Document Summarization: A Study for English and German Language

Timo Johner, Abhik Jana, Chris Biemann


Abstract
Recent research using pre-trained language models for the multi-document summarization (MDS) task lacks a deep investigation of potential error cases and of the applicability of these models to other languages. In this work, we apply a pre-trained language model (BART) to the MDS task, both with and without fine-tuning. We use two English datasets and one German dataset for this study. First, we reproduce multi-document summaries for English by following one of the recent studies. Next, we show the applicability of the model to German by achieving state-of-the-art performance on German MDS. We perform an in-depth error analysis of the followed approach for both languages, which leads us to identify the most notable error types, ranging from made-up facts to topic delimitation, and to quantify the degree of extractiveness.
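For illustration, here is a minimal sketch of generating an abstractive summary with a pre-trained BART model via the Hugging Face Transformers library. This is not the authors' pipeline (their code is in uhh-lt/multi-summ-german); the facebook/bart-large-cnn checkpoint (BART fine-tuned on CNN/Daily Mail) and the simple document-concatenation strategy for the multi-document input are assumptions made for this sketch.

from transformers import BartForConditionalGeneration, BartTokenizer

# Assumption: a publicly available BART checkpoint fine-tuned on CNN/Daily Mail.
model_name = "facebook/bart-large-cnn"
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name)

# Assumption: concatenating the source documents into a single input,
# a common simple strategy for multi-document summarization.
documents = ["First article text ...", "Second article text ..."]
text = " ".join(documents)

# BART's encoder accepts at most 1024 tokens, so longer inputs are truncated.
inputs = tokenizer(text, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = model.generate(
    inputs["input_ids"],
    num_beams=4,        # beam search for higher-quality summaries
    max_length=142,     # cap on summary length, in tokens
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))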
Anthology ID:
2021.nodalida-main.43
Volume:
Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)
Month:
31 May–2 June
Year:
2021
Address:
Reykjavik, Iceland (Online)
Editors:
Simon Dobnik, Lilja Øvrelid
Venue:
NoDaLiDa
Publisher:
Linköping University Electronic Press, Sweden
Pages:
391–397
URL:
https://aclanthology.org/2021.nodalida-main.43
Cite (ACL):
Timo Johner, Abhik Jana, and Chris Biemann. 2021. Error Analysis of using BART for Multi-Document Summarization: A Study for English and German Language. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), pages 391–397, Reykjavik, Iceland (Online). Linköping University Electronic Press, Sweden.
Cite (Informal):
Error Analysis of using BART for Multi-Document Summarization: A Study for English and German Language (Johner et al., NoDaLiDa 2021)
PDF:
https://aclanthology.org/2021.nodalida-main.43.pdf
Code
 uhh-lt/multi-summ-german
Data
CNN/Daily Mail