Freya Hewett


2024

Elaborative Simplification for German-Language Texts
Freya Hewett | Hadi Asghari | Manfred Stede
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue

There are many strategies for simplifying texts. In this paper, we focus specifically on the act of inserting information, also known as elaborative simplification. Information is added for various reasons, such as providing definitions for concepts, making relations between concepts more explicit, and providing background information that is a prerequisite for the main content. As all of these reasons share the main goal of ensuring coherence, we first conduct a corpus analysis of simplified German-language texts that have been annotated with Rhetorical Structure Theory (RST), focusing specifically on how additional information is incorporated into the RST annotation of a text. We then transfer these insights to automatic simplification with Large Language Models (LLMs), as elaborative simplification is a nuanced task which LLMs still seem to struggle with.
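To give a flavour of the kind of LLM-based setup evaluated here, the sketch below prompts an instruction-tuned model to insert a short explanation while simplifying a German sentence. The prompt wording, the model choice and the generation parameters are illustrative assumptions, not the exact setup used in the paper.

```python
# Illustrative sketch only: prompting an instruction-tuned LLM to perform
# elaborative simplification (inserting explanatory information) for German text.
# The prompt wording and the model are assumptions, not the paper's actual setup.
from transformers import pipeline

generator = pipeline("text-generation", model="meta-llama/Llama-3.1-8B-Instruct")  # hypothetical model choice

def elaborative_simplify(sentence: str, concept: str) -> str:
    """Ask the model to simplify the sentence and add a short explanation of `concept`."""
    prompt = (
        "Vereinfache den folgenden Satz und füge eine kurze Erklärung "
        f"für den Begriff '{concept}' ein:\n{sentence}\nVereinfachter Satz:"
    )
    out = generator(prompt, max_new_tokens=80, do_sample=False)
    # the pipeline returns the prompt plus continuation; keep only the continuation
    return out[0]["generated_text"][len(prompt):].strip()

print(elaborative_simplify(
    "Die Inflation belastet vor allem einkommensschwache Haushalte.",
    "Inflation",
))
```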

2023

APA-RST: A Text Simplification Corpus with RST Annotations
Freya Hewett
Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023)

We present a corpus of parallel German-language simplified newspaper articles. The articles have been aligned at sentence level and annotated according to the Rhetorical Structure Theory (RST) framework. These RST-annotated texts could shed light on structural aspects of text complexity and on how simplification works at the text level.

2022

HIIG at GermEval 2022: Best of Both Worlds Ensemble for Automatic Text Complexity Assessment
Hadi Asghari | Freya Hewett
Proceedings of the GermEval 2022 Workshop on Text Complexity Assessment of German Text

In this paper we describe HIIG’s contribution to the shared task Text Complexity DE Challenge 2022. Our best-performing model for automatically determining the complexity level of a German-language sentence is a combination of a transformer model and a classic feature-based model, which achieves a mapped root mean square error of 0.446 on the test data.
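As a rough illustration of this “best of both worlds” idea, the sketch below blends per-sentence complexity predictions from two models and evaluates with a plain RMSE. The 0.5 weighting and the toy values are assumptions, and the shared task’s mapped variant of the metric is omitted; this is not the HIIG system itself.

```python
import numpy as np

# Illustrative sketch: blend a transformer regressor's complexity scores with a
# feature-based model's scores and evaluate with plain RMSE. The weighting and
# the toy values are assumptions; the task's "mapped" RMSE variant is omitted.

def ensemble_predict(transformer_preds, feature_preds, weight=0.5):
    t = np.asarray(transformer_preds, dtype=float)
    f = np.asarray(feature_preds, dtype=float)
    return weight * t + (1.0 - weight) * f

gold = np.array([2.1, 4.0, 3.2])      # hypothetical gold complexity ratings
t_preds = np.array([2.4, 3.6, 3.0])   # hypothetical transformer outputs
f_preds = np.array([1.9, 4.2, 3.5])   # hypothetical feature-based outputs

blend = ensemble_predict(t_preds, f_preds)
rmse = float(np.sqrt(np.mean((gold - blend) ** 2)))
print(f"RMSE: {rmse:.3f}")
```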

Extractive Summarisation for German-language Data: A Text-level Approach with Discourse Features
Freya Hewett | Manfred Stede
Proceedings of the 29th International Conference on Computational Linguistics

We examine the link between facets of Rhetorical Structure Theory (RST) and the selection of content for extractive summarisation of German-language texts. For this purpose, we produce a set of extractive summaries for a dataset of German-language newspaper commentaries, a corpus which already has several layers of annotation. We provide an in-depth analysis of the connection between summary sentences and several RST-based features, and transfer these insights to various automated summarisation models. Our results show that RST features are informative for the task of extractive summarisation, particularly nuclearity and relations at the sentence level.
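As a small illustration of how RST-derived signals could feed a summariser, the sketch below scores sentences with two hypothetical features (nuclearity and depth in the RST tree) using a logistic regression classifier. The feature set, data and model are invented for illustration and do not reproduce the models compared in the paper.

```python
# Illustrative sketch: use simple RST-derived sentence features (is the sentence
# a nucleus? how deep does it sit in the RST tree?) to predict whether it belongs
# in an extractive summary. Features, data and model choice are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

# one row per sentence: [is_nucleus, depth_in_rst_tree]
X_train = np.array([[1, 1], [1, 2], [0, 3], [0, 4], [1, 2], [0, 5]])
y_train = np.array([1, 1, 0, 0, 1, 0])  # 1 = sentence was selected for the summary

clf = LogisticRegression().fit(X_train, y_train)

X_new = np.array([[1, 1], [0, 4]])
print(clf.predict_proba(X_new)[:, 1])  # probability of being a summary sentence
```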

2021

Automatically evaluating the conceptual complexity of German texts
Freya Hewett | Manfred Stede
Proceedings of the 17th Conference on Natural Language Processing (KONVENS 2021)

2019

The Utility of Discourse Parsing Features for Predicting Argumentation Structure
Freya Hewett | Roshan Prakash Rane | Nina Harlacher | Manfred Stede
Proceedings of the 6th Workshop on Argument Mining

Research on argumentation mining from text has frequently discussed relationships to discourse parsing, but few empirical results are available so far. One corpus that has been annotated in parallel for argumentation structure and for discourse structure (RST, SDRT) is the ‘argumentative microtexts’ corpus (Peldszus and Stede, 2016a). While results on using the gold RST annotations for predicting argumentation have been published (Peldszus and Stede, 2016b), the step to automatic discourse parsing has not yet been taken. In this paper, we run various discourse parsers (RST, PDTB) on the corpus, compare their results to the gold annotations (for RST), and then assess the contribution of automatically derived discourse features to argumentation parsing. After reproducing the state-of-the-art Evidence Graph model from Afantenos et al. (2018) for the microtexts, we find that PDTB features can indeed improve its performance.
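To give a flavour of how automatically derived discourse features might be handed to an argumentation model, the sketch below encodes a PDTB top-level relation sense per segment pair as a one-hot feature vector. The sense inventory and example pairs are illustrative assumptions; this is not the Evidence Graph implementation or the feature set used in the paper.

```python
# Illustrative sketch: turn automatically parsed PDTB relation senses between
# adjacent segments into one-hot feature vectors that a downstream argumentation
# model could consume. The sense inventory and example pairs are assumptions.
SENSES = ["Comparison", "Contingency", "Expansion", "Temporal", "NoRel"]

def pdtb_feature(sense: str) -> list:
    """One-hot encode a PDTB top-level sense (unknown senses map to 'NoRel')."""
    idx = SENSES.index(sense) if sense in SENSES else SENSES.index("NoRel")
    return [1 if i == idx else 0 for i in range(len(SENSES))]

# hypothetical parser output for adjacent segment pairs in one microtext
parsed = [("seg1", "seg2", "Contingency"), ("seg2", "seg3", "Comparison")]
features = {(a, b): pdtb_feature(sense) for a, b, sense in parsed}
print(features)
```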