2024
AdaptEval: Evaluating Large Language Models on Domain Adaptation for Text Summarization
Anum Afzal | Ribin Chalumattu | Florian Matthes | Laura Mascarell
Proceedings of the 1st Workshop on Customizable NLP: Progress and Challenges in Customizing NLP for a Domain, Application, Group, or Individual (CustomNLP4U)
Despite the advances in the abstractive summarization task using Large Language Models (LLMs), there is a lack of research that assesses their ability to adapt easily to different domains. We evaluate the domain adaptation abilities of a wide range of LLMs on the summarization task across various domains in both fine-tuning and in-context learning settings. We also present AdaptEval, the first domain adaptation evaluation suite. AdaptEval includes a domain benchmark and a set of metrics to facilitate the analysis of domain adaptation. Our results demonstrate that LLMs exhibit comparable performance in the in-context learning setting, regardless of their parameter scale.
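To make the in-context learning setting mentioned above concrete, here is a minimal sketch of how a k-shot summarization prompt for a target domain could be assembled; the prompt wording and example selection are illustrative assumptions, not the exact configuration used in the paper.

```python
# Hypothetical sketch of a k-shot in-context learning prompt for
# domain-specific summarization; the prompt format is an assumption,
# not the paper's exact setup.

def build_icl_prompt(demonstrations, article, k=2):
    """Assemble a few-shot prompt from (document, summary) pairs drawn from
    the target domain, followed by the article to summarize."""
    parts = ["Summarize the following article in a few sentences.\n"]
    for doc, summary in demonstrations[:k]:
        parts.append(f"Article: {doc}\nSummary: {summary}\n")
    parts.append(f"Article: {article}\nSummary:")
    return "\n".join(parts)

# The resulting string can be passed to any instruction-following LLM,
# regardless of its parameter scale, which is what the comparison above varies.
```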
Which Information Matters? Dissecting Human-written Multi-document Summaries with Partial Information Decomposition
Laura Mascarell | Yan L'Homme | Majed El Helou
Findings of the Association for Computational Linguistics: ACL 2024
Understanding the nature of high-quality summaries is crucial to further improve the performance of multi-document summarization. We propose an approach to characterize human-written summaries using partial information decomposition, which decomposes the mutual information provided by all source documents into union, redundancy, synergy, and unique information. Our empirical analysis on different MDS datasets shows that there is a direct dependency between the number of sources and their contribution to the summary.
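For reference, the standard two-source partial information decomposition splits the mutual information between the sources and the summary as shown below; under the usual definitions, union information is the total minus the synergy. The notation is a common convention and not necessarily the one used in the paper.

```latex
% Two-source partial information decomposition (standard formulation)
\begin{aligned}
I(Y; X_1, X_2) &= \mathrm{Red}(Y; X_1, X_2)
               + \mathrm{Unq}(Y; X_1 \setminus X_2)
               + \mathrm{Unq}(Y; X_2 \setminus X_1)
               + \mathrm{Syn}(Y; X_1, X_2) \\
\mathrm{Union}(Y; X_1, X_2) &= I(Y; X_1, X_2) - \mathrm{Syn}(Y; X_1, X_2)
\end{aligned}
```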
German Also Hallucinates! Inconsistency Detection in News Summaries with the Absinth Dataset
Laura Mascarell | Ribin Chalumattu | Annette Rios
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
The advent of Large Language Models (LLMs) has led to remarkable progress on a wide range of natural language processing tasks. Despite the advances, these large-sized models still suffer from hallucinating information in their output, which poses a major issue in automatic text summarization, as we must guarantee that the generated summary is consistent with the content of the source document. Previous research addresses the challenging task of detecting hallucinations in the output (i.e. inconsistency detection) in order to evaluate the faithfulness of the generated summaries. However, these works primarily focus on English and recent multilingual approaches lack German data. This work presents Absinth, a manually annotated dataset for hallucination detection in German news summarization and explores the capabilities of novel open-source LLMs on this task in both fine-tuning and in-context learning settings. We open-source and release the Absinth dataset to foster further research on hallucination detection in German.
2023
OpenTIPE: An Open-source Translation Framework for Interactive Post-Editing Research
Fabian Landwehr | Thomas Steinmann | Laura Mascarell
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Despite the latest improvements on machine translation, professional translators still must review and post-edit the automatic output to ensure high-quality translations. The research on automating this process lacks an interactive post-editing environment implemented for this purpose; therefore, current approaches do not consider the human interactions that occur in real post-editing scenarios. To address this issue, we present OpenTIPE, a flexible and extensible framework that aims at supporting research on interactive post-editing. Specifically, the interactive environment of OpenTIPE allows researchers to explore human-centered approaches for the post-editing task. We release the OpenTIPE source code and showcase its main functionalities with a demonstration video and an online live demo.
Entropy-based Sampling for Abstractive Multi-document Summarization in Low-resource Settings
Laura Mascarell | Ribin Chalumattu | Julien Heitmann
Proceedings of the 16th International Natural Language Generation Conference
Research in Multi-document Summarization (MDS) mostly focuses on the English language and depends on large MDS datasets that are not available for other languages. Some of these approaches concatenate the source documents, resulting in overlong model inputs. Existing transformer architectures are unable to process such long inputs entirely, omitting documents in the summarization process. Other solutions address this issue by implementing multi-stage approaches that also require changes in the model architecture. In this paper, we introduce various sampling approaches based on information entropy that allow us to perform MDS in a single stage. These approaches also consider all source documents without using MDS training data or changing the model’s architecture. In addition, we build an MDS test set of German news articles to assess the performance of our methods on abstractive multi-document summaries. Experimental results show that our entropy-based approaches outperform the previous state of the art on German MDS, while still remaining primarily abstractive. We release our code and MDS test set to encourage further research in German abstractive MDS.
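As a rough illustration of the idea, the sketch below scores sentences from all source documents by their average token surprisal under a unigram model and keeps the most informative ones to fit the summarizer's input budget; the actual scoring and selection in the paper may differ.

```python
# Illustrative sketch of entropy-based sentence sampling for building a
# single-stage MDS input. This is NOT the paper's exact method; the scoring
# function, sentence splitting, and selection budget are assumptions.
import math
from collections import Counter

def sentence_surprisal(sentence, corpus_counts, total_tokens):
    """Average negative log-probability of a sentence's tokens under a
    smoothed unigram model estimated from all source documents."""
    tokens = sentence.lower().split()
    if not tokens:
        return 0.0
    score = 0.0
    for tok in tokens:
        p = (corpus_counts[tok] + 1) / (total_tokens + len(corpus_counts) + 1)
        score += -math.log2(p)
    return score / len(tokens)

def build_model_input(documents, max_sentences=30):
    """Pick the most informative sentences across ALL documents so every
    source is considered within the model's input-length limit."""
    # Naive period-based sentence splitting for illustration only.
    sentences = [s.strip() for doc in documents for s in doc.split(".") if s.strip()]
    counts = Counter(tok for s in sentences for tok in s.lower().split())
    total = sum(counts.values())
    ranked = sorted(sentences,
                    key=lambda s: sentence_surprisal(s, counts, total),
                    reverse=True)
    return ". ".join(ranked[:max_sentences])
```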
2021
Stance Detection in German News Articles
Laura Mascarell | Tatyana Ruzsics | Christian Schneebeli | Philippe Schlattner | Luca Campanella | Severin Klingler | Cristina Kadar
Proceedings of the Fourth Workshop on Fact Extraction and VERification (FEVER)
The widespread use of the Internet and the rapid dissemination of information pose the challenge of identifying the veracity of its content. Stance detection, which is the task of predicting the position of a text in regard to a specific target (e.g. claim or debate question), has been used to determine the veracity of information in tasks such as rumor classification and fake news detection. While most of the work and available datasets for stance detection address short text snippets extracted from textual dialogues, social media platforms, or news headlines with a strong focus on the English language, there is a lack of resources targeting long texts in other languages. Our contribution in this paper is twofold. First, we present a German dataset of debate questions and news articles that is manually annotated for stance and emotion detection. Second, we leverage the dataset to tackle the supervised task of classifying the stance of a news article with regard to a debate question and provide baseline models as a reference for future work on stance detection in German news articles.
2017
Consistent Translation of Repeated Nouns using Syntactic and Semantic Cues
Xiao Pu | Laura Mascarell | Andrei Popescu-Belis
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
We propose a method to decide whether two occurrences of the same noun in a source text should be translated consistently, i.e. using the same noun in the target text as well. We train and test classifiers that predict consistent translations based on lexical, syntactic, and semantic features. We first evaluate the accuracy of our classifiers intrinsically, in terms of the accuracy of consistency predictions, over a subset of the UN Corpus. Then, we also evaluate them in combination with phrase-based statistical MT systems for Chinese-to-English and German-to-English. We compare the automatic post-editing of noun translations with the re-ranking of the translation hypotheses based on the classifiers’ output, and also use these methods in combination. This improves over the baseline and closes up to 50% of the gap in BLEU scores between the baseline and an oracle classifier.
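As a hedged sketch of the intrinsic classification step described above, the following trains a binary classifier that predicts whether two occurrences of the same source noun should be translated with the same target noun; the features and model choice here are hypothetical stand-ins for the lexical, syntactic, and semantic features used in the paper.

```python
# Illustrative sketch of consistency prediction for repeated nouns.
# The feature set and classifier are simplifying assumptions, not the
# paper's actual configuration.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def extract_features(occurrence_pair):
    """Toy features for two occurrences of the same noun (hypothetical)."""
    first, second = occurrence_pair
    return {
        "same_head": first["head_lemma"] == second["head_lemma"],  # syntactic cue
        "sentence_distance": second["sent_id"] - first["sent_id"],
        "pos_first": first["pos"],
        "pos_second": second["pos"],
    }

def train_consistency_classifier(pairs, labels):
    """labels[i] is True when the reference translation uses the same target
    noun for both occurrences; the trained model can then be used for
    post-editing or for re-ranking translation hypotheses."""
    model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
    model.fit([extract_features(p) for p in pairs], labels)
    return model
```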
Improving Word Sense Disambiguation in Neural Machine Translation with Sense Embeddings
Annette Rios Gonzales | Laura Mascarell | Rico Sennrich
Proceedings of the Second Conference on Machine Translation
Lexical Chains meet Word Embeddings in Document-level Statistical Machine Translation
Laura Mascarell
Proceedings of the Third Workshop on Discourse in Machine Translation
The phrase-based Statistical Machine Translation (SMT) approach deals with sentences in isolation, making it difficult to consider discourse context in translation. This poses a challenge for ambiguous words that need discourse knowledge to be correctly translated. We propose a method that benefits from the semantic similarity in lexical chains to improve SMT output by integrating it in a document-level decoder. We focus on word embeddings to deal with the lexical chains, contrary to the traditional approach that uses lexical resources. Experimental results on German-to-English show that our method produces correct translations in up to 88% of the changes, improving the translation in 36%-48% of them over the baseline.
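A minimal sketch of embedding-based lexical chains is given below: content words whose vectors are sufficiently similar are greedily linked into the same chain. The similarity threshold and linking strategy are assumptions for illustration, and the integration into the document-level SMT decoder is not shown.

```python
# Illustrative sketch of building lexical chains from word embeddings.
# Threshold and greedy linking are assumptions; the paper's decoder
# integration is not reproduced here.
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def build_lexical_chains(words, embeddings, threshold=0.6):
    """Greedily add each word to the first chain whose last member is
    semantically similar; otherwise start a new chain.
    `embeddings` maps a word to its vector (e.g. pre-trained word2vec)."""
    chains = []
    for word in words:
        vec = embeddings.get(word)
        if vec is None:  # skip out-of-vocabulary words
            continue
        for chain in chains:
            if cosine(vec, embeddings[chain[-1]]) >= threshold:
                chain.append(word)
                break
        else:
            chains.append([word])
    return chains
```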
2015
Detecting Document-level Context Triggers to Resolve Translation Ambiguity
Laura Mascarell | Mark Fishel | Martin Volk
Proceedings of the Second Workshop on Discourse in Machine Translation
Leveraging Compounds to Improve Noun Phrase Translation from Chinese and German
Xiao Pu | Laura Mascarell | Andrei Popescu-Belis | Mark Fishel | Ngoc-Quang Luong | Martin Volk
Proceedings of the ACL-IJCNLP 2015 Student Research Workshop
2013
tSEARCH: Flexible and Fast Search over Automatic Translations for Improved Quality/Error Analysis
Meritxell Gonzàlez | Laura Mascarell | Lluís Màrquez
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations