Christina Niklaus

2025

ARTIST: A Learning Support System for Fostering Students’ Argumentative Writing Skills
Thomas Huber | Christina Niklaus
Proceedings of the 18th International Natural Language Generation Conference: System Demonstrations

pdf bib abs

BioMistral-Clinical: A Scalable Approach to Clinical LLMs via Incremental Learning and RAG
Ziwei Chen | Bernhard Bermeitinger | Christina Niklaus
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics

The integration of large language models (LLMs) into clinical medicine represents a major advancement in natural language processing (NLP). We introduce BioMistral-Clinical 7B, a clinical LLM built on BioMistral-7B (Labrak et al., 2024), designed to support continual learning from unstructured clinical notes for real-world tasks such as clinical decision support. Using the augmented-clinical-notes dataset provided by Hugging Face (2024), we apply prompt engineering to transform unstructured text into structured JSON, capturing key clinical information (symptoms, diagnoses, treatments, outcomes). This enables efficient incremental training via self-supervised continual learning (SPeCiaL) (Caccia and Pineau, 2021). Evaluation on MedQA (Jin et al., 2021) and MedMCQA (Pal et al., 2022) shows that BioMistral-Clinical 7B improves accuracy on MedMCQA by nearly 10 points (37.4% vs. 28.0%) over the base model, while maintaining comparable performance on MedQA (34.8% vs. 36.5%). Building on this, we propose the BioMistral-Clinical System, which integrates Retrieval-Augmented Generation (RAG) (Lewis et al., 2020) to enrich responses with relevant clinical cases retrieved from a structured vector database. The full system enhances clinical reasoning by combining domain-specific adaptation with contextual retrieval.

pdf bib abs

LLMs meet Bloom’s Taxonomy: A Cognitive View on Large Language Model Evaluations
Thomas Huber | Christina Niklaus
Proceedings of the 31st International Conference on Computational Linguistics

Current evaluation approaches for Large Language Models (LLMs) lack a structured approach that reflects the underlying cognitive abilities required for solving the tasks. This hinders a thorough understanding of the current level of LLM capabilities. For instance, it is widely accepted that LLMs perform well in terms of grammar, but it is unclear in what specific cognitive areas they excel or struggle in. This paper introduces a novel perspective on the evaluation of LLMs that leverages a hierarchical classification of tasks. Specifically, we explore the most widely used benchmarks for LLMs to systematically identify how well these existing evaluation methods cover the levels of Bloom’s Taxonomy, a hierarchical framework for categorizing cognitive skills. This comprehensive analysis allows us to identify strengths and weaknesses in current LLM assessment strategies in terms of cognitive abilities and suggest directions for both future benchmark development as well as highlight potential avenues for LLM research. Our findings reveal that LLMs generally perform better on the lower end of Bloom’s Taxonomy. Additionally, we find that there are significant gaps in the coverage of cognitive skills in the most commonly used benchmarks.

pdf bib abs

CLEAR: A Comprehensive Linguistic Evaluation of Argument Rewriting by Large Language Models
Thomas Huber | Christina Niklaus
Findings of the Association for Computational Linguistics: EMNLP 2025

While LLMs have been extensively studied on general text generation tasks, there is less research on text rewriting, a task related to general text generation, and particularly on the behavior of models on this task. In this paper we analyze what changes LLMs make in a text rewriting setting. We focus specifically on argumentative texts and their improvement, a task named Argument Improvement (ArgImp). We present CLEAR: an evaluation pipeline consisting of 57 metrics mapped to four linguistic levels: lexical, syntactic, semantic and pragmatic. This pipeline is used to examine the qualities of LLM-rewritten arguments on a broad set of argumentation corpora and compare the behavior of different LLMs on this task and analyze the behavior of different LLMs on this task in terms of linguistic levels. By taking all four linguistic levels into consideration, we find that the models perform ArgImp by shortening the texts while simultaneously increasing average word length and merging sentences. Overall we note an increase in the persuasion and coherence dimensions.

2024

pdf bib abs

Let’s discuss! Quality Dimensions and Annotated Datasets for Computational Argument Quality Assessment
Rositsa V Ivanova | Thomas Huber | Christina Niklaus
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Research in the computational assessment of Argumentation Quality has gained popularity over the last ten years. Various quality dimensions have been explored through the creation of domain-specific datasets and assessment methods. We survey the related literature (211 publications and 32 datasets), while addressing potential overlaps and blurry boundaries to related domains. This paper provides a representative overview of the state of the art in Computational Argument Quality Assessment with a focus on quality dimensions and annotated datasets. The aim of the survey is to identify research gaps and to aid future discussions and work in the domain.

2023

pdf bib abs

When Truth Matters - Addressing Pragmatic Categories in Natural Language Inference (NLI) by Large Language Models (LLMs)
Reto Gubelmann | Aikaterini-lida Kalouli | Christina Niklaus | Siegfried Handschuh
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)

In this paper, we focus on the ability of large language models (LLMs) to accommodate different pragmatic sentence types, such as questions, commands, as well as sentence fragments for natural language inference (NLI). On the commonly used notion of logical inference, nothing can be inferred from a question, an order, or an incomprehensible sentence fragment. We find MNLI, arguably the most important NLI dataset, and hence models fine-tuned on this dataset, insensitive to this fact. Using a symbolic semantic parser, we develop and make publicly available, fine-tuning datasets designed specifically to address this issue, with promising results. We also make a first exploration of ChatGPT’s concept of entailment.

pdf bib abs

Enhancing Educational Dialogues: A Reinforcement Learning Approach for Generating AI Teacher Responses
Thomas Huber | Christina Niklaus | Siegfried Handschuh
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)

Reinforcement Learning remains an underutilized method of training and fine-tuning Language Models (LMs) despite recent successes. This paper presents a simple approach of fine-tuning a language model with Reinforcement Learning to achieve competitive performance on the BEA 2023 Shared Task whose goal is to automatically generate teacher responses in educational dialogues. We utilized the novel NLPO algorithm that masks out tokens during generation to direct the model towards generations that maximize a reward function. We show results for both the t5-base model with 220 million parameters from the HuggingFace repository submitted to the leaderboard that, despite its comparatively small size, has achieved a good performance on both test and dev set, as well as GPT-2 with 124 million parameters. The presented results show that despite maximizing only one of the metrics used in the evaluation as a reward function our model scores highly in the other metrics as well.

2022

pdf bib abs

Shallow Discourse Parsing for Open Information Extraction and Text Simplification
Christina Niklaus | André Freitas | Siegfried Handschuh
Proceedings of the 3rd Workshop on Computational Approaches to Discourse

We present a discourse-aware text simplification (TS) approach that recursively splits and rephrases complex English sentences into a semantic hierarchy of simplified sentences. Using a set of linguistically principled transformation patterns, sentences are converted into a hierarchical representation in the form of core sentences and accompanying contexts that are linked via rhetorical relations. As opposed to previously proposed sentence splitting approaches, which commonly do not take into account discourse-level aspects, our TS approach preserves the semantic relationship of the decomposed constituents in the output. A comparative analysis with the annotations contained in RST-DT shows that we capture the contextual hierarchy between the split sentences with a precision of 89% and reach an average precision of 69% for the classification of the rhetorical relations that hold between them. Moreover, an integration into state-of-the-art Open Information Extraction (IE) systems reveals that when applying our TS approach as a pre-processing step, the generated relational tuples are enriched with additional meta information, resulting in a novel lightweight semantic representation for the task of Open IE.

pdf bib abs

Modeling Persuasive Discourse to Adaptively Support Students’ Argumentative Writing
Thiemo Wambsganss | Christina Niklaus
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We introduce an argumentation annotation approach to model the structure of argumentative discourse in student-written business model pitches. Additionally, the annotation scheme captures a series of persuasiveness scores such as the specificity, strength, evidence, and relevance of the pitch and the individual components. Based on this scheme, we annotated a corpus of 200 business model pitches in German. Moreover, we trained predictive models to detect argumentative discourse structures and embedded them in an adaptive writing support system for students that provides them with individual argumentation feedback independent of an instructor, time, and location. We evaluated our tool in a real-world writing exercise and found promising results for the measured self-efficacy and perceived ease-of-use. Finally, we present our freely available corpus of persuasive business model pitches with 3,207 annotated sentences in German language and our annotation guidelines.

pdf bib

A Philosophically-Informed Contribution to the Generalization Problem of Neural Natural Language Inference: Shallow Heuristics, Bias, and the Varieties of Inference
Reto Gubelmann | Christina Niklaus | Siegfried Handschuh
Proceedings of the 3rd Natural Logic Meets Machine Learning Workshop (NALOMA III)

2021

pdf bib abs

Supporting Cognitive and Emotional Empathic Writing of Students
Thiemo Wambsganss | Christina Niklaus | Matthias Söllner | Siegfried Handschuh | Jan Marco Leimeister
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

We present an annotation approach to capturing emotional and cognitive empathy in student-written peer reviews on business models in German. We propose an annotation scheme that allows us to model emotional and cognitive empathy scores based on three types of review components. Also, we conducted an annotation study with three annotators based on 92 student essays to evaluate our annotation scheme. The obtained inter-rater agreement of α=0.79 for the components and the multi-π=0.41 for the empathy scores indicate that the proposed annotation scheme successfully guides annotators to a substantial to moderate agreement. Moreover, we trained predictive models to detect the annotated empathy structures and embedded them in an adaptive writing support system for students to receive individual empathy feedback independent of an instructor, time, and location. We evaluated our tool in a peer learning exercise with 58 students and found promising results for perceived empathy skill learning, perceived feedback accuracy, and intention to use. Finally, we present our freely available corpus of 500 empathy-annotated, student-written peer reviews on business models and our annotation guidelines to encourage future research on the design and development of empathy support systems.

2020

pdf bib abs

A Corpus for Argumentative Writing Support in German
Thiemo Wambsganss | Christina Niklaus | Matthias Söllner | Siegfried Handschuh | Jan Marco Leimeister
Proceedings of the 28th International Conference on Computational Linguistics

In this paper, we present a novel annotation approach to capture claims and premises of arguments and their relations in student-written persuasive peer reviews on business models in German language. We propose an annotation scheme based on annotation guidelines that allows to model claims and premises as well as support and attack relations for capturing the structure of argumentative discourse in student-written peer reviews. We conduct an annotation study with three annotators on 50 persuasive essays to evaluate our annotation scheme. The obtained inter-rater agreement of α = 0.57 for argument components and α = 0.49 for argumentative relations indicates that the proposed annotation scheme successfully guides annotators to moderate agreement. Finally, we present our freely available corpus of 1,000 persuasive student-written peer reviews on business models and our annotation guidelines to encourage future research on the design and development of argumentative writing support systems for students.

2019

pdf bib abs

MinWikiSplit: A Sentence Splitting Corpus with Minimal Propositions
Christina Niklaus | André Freitas | Siegfried Handschuh
Proceedings of the 12th International Conference on Natural Language Generation

We compiled a new sentence splitting corpus that is composed of 203K pairs of aligned complex source and simplified target sentences. Contrary to previously proposed text simplification corpora, which contain only a small number of split examples, we present a dataset where each input sentence is broken down into a set of minimal propositions, i.e. a sequence of sound, self-contained utterances with each of them presenting a minimal semantic unit that cannot be further decomposed into meaningful propositions. This corpus is useful for developing sentence splitting approaches that learn how to transform sentences with a complex linguistic structure into a fine-grained representation of short sentences that present a simple and more regular structure which is easier to process for downstream applications and thus facilitates and improves their performance.

pdf bib abs

Transforming Complex Sentences into a Semantic Hierarchy
Christina Niklaus | Matthias Cetto | André Freitas | Siegfried Handschuh
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We present an approach for recursively splitting and rephrasing complex English sentences into a novel semantic hierarchy of simplified sentences, with each of them presenting a more regular structure that may facilitate a wide variety of artificial intelligence tasks, such as machine translation (MT) or information extraction (IE). Using a set of hand-crafted transformation rules, input sentences are recursively transformed into a two-layered hierarchical representation in the form of core sentences and accompanying contexts that are linked via rhetorical relations. In this way, the semantic relationship of the decomposed constituents is preserved in the output, maintaining its interpretability for downstream applications. Both a thorough manual analysis and automatic evaluation across three datasets from two different domains demonstrate that the proposed syntactic simplification approach outperforms the state of the art in structural text simplification. Moreover, an extrinsic evaluation shows that when applying our framework as a preprocessing step the performance of state-of-the-art Open IE systems can be improved by up to 346% in precision and 52% in recall. To enable reproducible research, all code is provided online.

pdf bib abs

DisSim: A Discourse-Aware Syntactic Text Simplification Framework for English and German
Christina Niklaus | Matthias Cetto | André Freitas | Siegfried Handschuh
Proceedings of the 12th International Conference on Natural Language Generation

We introduce DisSim, a discourse-aware sentence splitting framework for English and German whose goal is to transform syntactically complex sentences into an intermediate representation that presents a simple and more regular structure which is easier to process for downstream semantic applications. For this purpose, we turn input sentences into a two-layered semantic hierarchy in the form of core facts and accompanying contexts, while identifying the rhetorical relations that hold between them. In that way, we preserve the coherence structure of the input and, hence, its interpretability for downstream tasks.

2018

pdf bib abs

A Survey on Open Information Extraction
Christina Niklaus | Matthias Cetto | André Freitas | Siegfried Handschuh
Proceedings of the 27th International Conference on Computational Linguistics

We provide a detailed overview of the various approaches that were proposed to date to solve the task of Open Information Extraction. We present the major challenges that such systems face, show the evolution of the suggested approaches over time and depict the specific issues they address. In addition, we provide a critique of the commonly applied evaluation procedures for assessing the performance of Open IE systems and highlight some directions for future work.

pdf bib abs

Graphene: Semantically-Linked Propositions in Open Information Extraction
Matthias Cetto | Christina Niklaus | André Freitas | Siegfried Handschuh
Proceedings of the 27th International Conference on Computational Linguistics

We present an Open Information Extraction (IE) approach that uses a two-layered transformation stage consisting of a clausal disembedding layer and a phrasal disembedding layer, together with rhetorical relation identification. In that way, we convert sentences that present a complex linguistic structure into simplified, syntactically sound sentences, from which we can extract propositions that are represented in a two-layered hierarchy in the form of core relational tuples and accompanying contextual information which are semantically linked via rhetorical relations. In a comparative evaluation, we demonstrate that our reference implementation Graphene outperforms state-of-the-art Open IE systems in the construction of correct n-ary predicate-argument structures. Moreover, we show that existing Open IE approaches can benefit from the transformation process of our framework.

pdf bib abs

Graphene: a Context-Preserving Open Information Extraction System
Matthias Cetto | Christina Niklaus | André Freitas | Siegfried Handschuh
Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations

We introduce Graphene, an Open IE system whose goal is to generate accurate, meaningful and complete propositions that may facilitate a variety of downstream semantic applications. For this purpose, we transform syntactically complex input sentences into clean, compact structures in the form of core facts and accompanying contexts, while identifying the rhetorical relations that hold between them in order to maintain their semantic relationship. In that way, we preserve the context of the relational tuples extracted from a source sentence, generating a novel lightweight semantic representation for Open IE that enhances the expressiveness of the extracted propositions.

2016

pdf bib abs

A Sentence Simplification System for Improving Relation Extraction
Christina Niklaus | Bernhard Bermeitinger | Siegfried Handschuh | André Freitas
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

We present a text simplification approach that is directed at improving the performance of state-of-the-art Open Relation Extraction (RE) systems. As syntactically complex sentences often pose a challenge for current Open RE approaches, we have developed a simplification framework that performs a pre-processing step by taking a single sentence as input and using a set of syntactic-based transformation rules to create a textual input that is easier to process for subsequently applied Open RE systems.

Co-authors

Bernhard Bermeitinger 2

Reto Gubelmann 2

Jan Marco Leimeister 2

Matthias Söllner 2

Ziwei Chen 1

Rositsa V Ivanova 1

Aikaterini-Lida Kalouli 1

Venues