Arne Binder


Full-Text Argumentation Mining on Scientific Publications
Arne Binder | Leonhard Hennig | Bhuvanesh Verma
Proceedings of the first Workshop on Information Extraction from Scientific Publications

Scholarly Argumentation Mining (SAM) has recently gained attention due to its potential to help scholars cope with the rapid growth of published scientific literature. It comprises two subtasks: argumentative discourse unit recognition (ADUR) and argumentative relation extraction (ARE), both of which are challenging since they require, for example, the integration of domain knowledge, the detection of implicit statements, and the disambiguation of argument structure. While previous work focused on dataset construction and baseline methods for specific document sections, such as the abstract or results, full-text scholarly argumentation mining has seen little progress. In this work, we introduce a sequential pipeline model combining ADUR and ARE for full-text SAM, and provide a first analysis of the performance of pretrained language models (PLMs) on both subtasks. We establish a new state of the art for ADUR on the Sci-Arg corpus, outperforming the previous best reported result by a large margin (+7% F1). We also present the first results for ARE, and thus for the full AM pipeline, on this benchmark dataset. Our detailed error analysis reveals that non-contiguous ADUs as well as the interpretation of discourse connectors pose major challenges and that data annotation needs to be more consistent.

A Comparative Study of Pre-trained Encoders for Low-Resource Named Entity Recognition
Yuxuan Chen | Jonas Mikkelsen | Arne Binder | Christoph Alt | Leonhard Hennig
Proceedings of the 7th Workshop on Representation Learning for NLP

Pre-trained language models (PLMs) are effective components of few-shot named entity recognition (NER) approaches when augmented with continued pre-training on task-specific out-of-domain data or fine-tuning on in-domain data. However, their performance in low-resource scenarios, where such data is not available, remains an open question. We introduce an encoder evaluation framework and use it to systematically compare the performance of state-of-the-art pre-trained representations on the task of low-resource NER. We analyze a wide range of encoders pre-trained with different strategies, model architectures, intermediate-task fine-tuning, and contrastive learning. Our experimental results across ten benchmark NER datasets in English and German show that encoder performance varies significantly, suggesting that the choice of encoder for a specific low-resource scenario needs to be carefully evaluated.


An Empirical Comparison of Question Classification Methods for Question Answering Systems
Eduardo Cortes | Vinicius Woloszyn | Arne Binder | Tilo Himmelsbach | Dante Barone | Sebastian Möller
Proceedings of the Twelfth Language Resources and Evaluation Conference

Question classification is an important component of Question Answering Systems, responsible for identifying the type of answer a particular question requires. For instance, “Who is the prime minister of the United Kingdom?” demands the name of a PERSON, while “When was the queen of the United Kingdom born?” entails a DATE. This work presents an extensive review of the most recent methods for Question Classification, taking into consideration their applicability to low-resource languages. First, we propose a manual classification of the current state-of-the-art methods into four distinct categories: low, medium, high, and very high level of dependency on external resources. Second, we apply this categorization in an empirical comparison in terms of the amount of data necessary for training and performance in different languages. In addition to complementing earlier works in this field, our study shows a performance boost for methods relying on recent language models, which outperform methods not suited to low-resource languages.