Freya Hewett

2025

Disagreements in analyses of rhetorical text structure: A new dataset and first analyses
Freya Hewett | Manfred Stede
Proceedings of the 19th Linguistic Annotation Workshop (LAW-XIX-2025)

Discourse structure annotation is known to involve a high level of subjectivity, which often results in low inter-annotator agreement. In this paper, we focus on “legitimate disagreements”, by which we refer to multiple valid annotations for a text or text segment. We provide a new dataset of English and German texts, where each text comes with two parallel analyses (both done by well-trained annotators) in the framework of Rhetorical Structure Theory. Using the RST Tace tool, we build a list of all conflicting annotation decisions and present some statistics for the corpus. Thereafter, we undertake a qualitative analysis of the disagreements and propose a typology of underlying reasons. From this we derive the need to differentiate two kinds of ambiguities in RST annotation: those that result from inherent “everyday” linguistic ambiguity, and those that arise from specifications in the theory and/or the annotation schemes.

2024

Elaborative Simplification for German-Language Texts
Freya Hewett | Hadi Asghari | Manfred Stede
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue

There are many strategies used to simplify texts. In this paper, we focus specifically on the act of inserting information, known as elaborative simplification. Adding information is done for various reasons, such as providing definitions for concepts, making relations between concepts more explicit, and providing background information that is a prerequisite for the main content. As all of these reasons have the main goal of ensuring coherence, we first conduct a corpus analysis of simplified German-language texts that have been annotated with Rhetorical Structure Theory (RST). We focus specifically on how additional information is incorporated into the RST annotation for a text. We then transfer these insights to automatic simplification using Large Language Models (LLMs), as elaborative simplification is a nuanced task which LLMs still seem to struggle with.

2023

APA-RST: A Text Simplification Corpus with RST Annotations
Freya Hewett
Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023)

We present a corpus of parallel German-language simplified newspaper articles. The articles have been aligned at the sentence level and annotated according to the Rhetorical Structure Theory (RST) framework. These RST-annotated texts could shed light on structural aspects of text complexity and on how simplifications work at the text level.

2022

Extractive Summarisation for German-language Data: A Text-level Approach with Discourse Features
Freya Hewett | Manfred Stede
Proceedings of the 29th International Conference on Computational Linguistics

We examine the link between facets of Rhetorical Structure Theory (RST) and the selection of content for extractive summarisation of German-language texts. For this purpose, we produce a set of extractive summaries for a dataset of German-language newspaper commentaries, a corpus which already has several layers of annotation. We provide an in-depth analysis of the connection between summary sentences and several RST-based features and transfer these insights to various automated summarisation models. Our results show that RST features are informative for the task of extractive summarisation, particularly nuclearity and relations at the sentence level.

HIIG at GermEval 2022: Best of Both Worlds Ensemble for Automatic Text Complexity Assessment
Hadi Asghari | Freya Hewett
Proceedings of the GermEval 2022 Workshop on Text Complexity Assessment of German Text

In this paper, we explain HIIG’s contribution to the shared task Text Complexity DE Challenge 2022. Our best-performing model for the task of automatically determining the complexity level of a German-language sentence is a combination of a transformer model and a classic feature-based model, which achieves a mapped root mean square error of 0.446 on the test data.

2021

Automatically evaluating the conceptual complexity of German texts
Freya Hewett | Manfred Stede
Proceedings of the 17th Conference on Natural Language Processing (KONVENS 2021)

2019

The Utility of Discourse Parsing Features for Predicting Argumentation Structure
Freya Hewett | Roshan Prakash Rane | Nina Harlacher | Manfred Stede
Proceedings of the 6th Workshop on Argument Mining

Research on argumentation mining from text has frequently discussed relationships to discourse parsing, but few empirical results are available so far. One corpus that has been annotated in parallel for argumentation structure and for discourse structure (RST, SDRT) is the ‘argumentative microtexts’ corpus (Peldszus and Stede, 2016a). While results on perusing the gold RST annotations for predicting argumentation have been published (Peldszus and Stede, 2016b), the step to automatic discourse parsing has not yet been taken. In this paper, we run various discourse parsers (RST, PDTB) on the corpus, compare their results to the gold annotations (for RST) and then assess the contribution of automatically derived discourse features for argumentation parsing. After reproducing the state-of-the-art Evidence Graph model from Afantenos et al. (2018) for the microtexts, we find that PDTB features can indeed improve its performance.
