Joachim Bingel


CoAStaL at SemEval-2019 Task 3: Affect Classification in Dialogue using Attentive BiLSTMs
Ana Valeria González | Victor Petrén Bach Hansen | Joachim Bingel | Anders Søgaard
Proceedings of the 13th International Workshop on Semantic Evaluation

This work describes the system presented by the CoAStaL Natural Language Processing group at the University of Copenhagen. The main system we present uses the same attention mechanism as Yang et al. (2016). Our overall model architecture is also inspired by their hierarchical classification model and adapted to deal with classification in dialogue by encoding information at the turn level. We use different encodings for each turn to create a more expressive representation of dialogue context, which is then fed into our classifier. We also define a custom preprocessing step in order to deal with language commonly used in interactions across many social media outlets. Our proposed system achieves a micro F1 score of 0.7340 on the test set and shows significant gains in performance compared to a system using dialogue level encoding.


Predicting misreadings from gaze in children with reading difficulties
Joachim Bingel | Maria Barrett | Sigrid Klerke
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

We present the first work on predicting reading mistakes in children with reading difficulties based on eye-tracking data from real-world reading teaching. Our approach employs several linguistic and gaze-based features to inform an ensemble of different classifiers, including multi-task learning models that let us transfer knowledge about individual readers to attain better predictions. Notably, the data we use in this work stems from noisy readings in the wild, outside of controlled lab conditions. Our experiments show that despite the noise and despite the small fraction of misreadings, gaze data improves the performance more than any other feature group and our models achieve good performance. We further show that gaze patterns for misread words do not fully generalize across readers, but that we can transfer some knowledge between readers using multitask learning at least in some cases. Applications of our models include partial automation of reading assessment as well as personalized text simplification.

Cross-lingual complex word identification with multitask learning
Joachim Bingel | Johannes Bjerva
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

We approach the 2018 Shared Task on Complex Word Identification by leveraging a cross-lingual multitask learning approach. Our method is highly language agnostic, as evidenced by the ability of our system to generalize across languages, including languages for which we have no training data. In the shared task, this is the case for French, for which our system achieves the best performance. We further provide a qualitative and quantitative analysis of which words pose problems for our system.

Multi-task learning for historical text normalization: Size matters
Marcel Bollmann | Anders Søgaard | Joachim Bingel
Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP

Historical text normalization suffers from small datasets that exhibit high variance, and previous work has shown that multi-task learning can be used to leverage data from related problems in order to obtain more robust models. Previous work has been limited to datasets from a specific language and a specific historical period, and it is not clear whether results generalize. It therefore remains an open problem when historical text normalization benefits from multi-task learning. We explore the benefits of multi-task learning across 10 different datasets, representing different languages and periods. Our main finding, contrary to what has been observed for other NLP tasks, is that multi-task learning mainly works when target task data is very scarce.

Lexi: A tool for adaptive, personalized text simplification
Joachim Bingel | Gustavo Paetzold | Anders Søgaard
Proceedings of the 27th International Conference on Computational Linguistics

Most previous research in text simplification has aimed to develop generic solutions, assuming very homogeneous target audiences with consistent intra-group simplification needs. We argue that this assumption does not hold, and that instead we need to develop simplification systems that adapt to the individual needs of specific users. As a first step towards personalized simplification, we propose a framework for adaptive lexical simplification and introduce Lexi, a free, open-source, and easily extensible tool for adaptive, personalized text simplification. Lexi is easily installed as a browser extension, enabling easy access to the service for its users.

Sequence Classification with Human Attention
Maria Barrett | Joachim Bingel | Nora Hollenstein | Marek Rei | Anders Søgaard
Proceedings of the 22nd Conference on Computational Natural Language Learning

Learning attention functions requires large volumes of data, but many NLP tasks simulate human behavior, and in this paper, we show that human attention really does provide a good inductive bias on many attention functions in NLP. Specifically, we use estimated human attention derived from eye-tracking corpora to regularize attention functions in recurrent neural networks. We show substantial improvements across a range of tasks, including sentiment analysis, grammatical error detection, and detection of abusive language.


Identifying beneficial task relations for multi-task learning in deep neural networks
Joachim Bingel | Anders Søgaard
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Multi-task learning (MTL) in deep neural networks for NLP has recently received increasing interest due to some compelling benefits, including its potential to efficiently regularize models and to reduce the need for labeled data. While it has brought significant improvements in a number of NLP tasks, mixed results have been reported, and little is known about the conditions under which MTL leads to gains in NLP. This paper sheds light on the specific task relations that can lead to gains from MTL models over single-task setups.

Learning How to Simplify From Explicit Labeling of Complex-Simplified Text Pairs
Fernando Alva-Manchego | Joachim Bingel | Gustavo Paetzold | Carolina Scarton | Lucia Specia
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Current research in text simplification has been hampered by two central problems: (i) the small amount of high-quality parallel simplification data available, and (ii) the lack of explicit annotations of simplification operations, such as deletions or substitutions, on existing data. While the recently introduced Newsela corpus has alleviated the first problem, simplifications still need to be learned directly from parallel text using black-box, end-to-end approaches rather than from explicit annotations. These complex-simple parallel sentence pairs often differ to such a high degree that generalization becomes difficult. End-to-end models also make it hard to interpret what is actually learned from data. We propose a method that decomposes the task of TS into its sub-problems. We devise a way to automatically identify operations in a parallel corpus and introduce a sequence-labeling approach based on these annotations. Finally, we provide insights on the types of transformations that different approaches can model.

Learning attention for historical text normalization by learning to pronounce
Marcel Bollmann | Joachim Bingel | Anders Søgaard
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Automated processing of historical texts often relies on pre-normalization to modern word forms. Training encoder-decoder architectures to solve such problems typically requires a lot of training data, which is not available for the named task. We address this problem by using several novel encoder-decoder architectures, including a multi-task learning (MTL) architecture using a grapheme-to-phoneme dictionary as auxiliary data, pushing the state-of-the-art by an absolute 2% increase in performance. We analyze the induced models across 44 different texts from Early New High German. Interestingly, we observe that, as previously conjectured, multi-task learning can learn to focus attention during decoding, in ways remarkably similar to recently proposed attention mechanisms. This, we believe, is an important step toward understanding how MTL works.


CoastalCPH at SemEval-2016 Task 11: The importance of designing your Neural Networks right
Joachim Bingel | Natalie Schluter | Héctor Martínez Alonso
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

KorAP Architecture ― Diving in the Deep Sea of Corpus Data
Nils Diewald | Michael Hanl | Eliza Margaretha | Joachim Bingel | Marc Kupietz | Piotr Bański | Andreas Witt
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

KorAP is a corpus search and analysis platform, developed at the Institute for the German Language (IDS). It supports very large corpora with multiple annotation layers, multiple query languages, and complex licensing scenarios. KorAP’s design aims to be scalable, flexible, and sustainable to serve the German Reference Corpus DeReKo for at least the next decade. To meet these requirements, we have adopted a highly modular microservice-based architecture. This paper outlines our approach: an architecture consisting of small components that are easy to extend, replace, and maintain. The components include a search backend, a user and corpus license management system, and a web-based user frontend. We also describe a general corpus query protocol used by all microservices for internal communications. KorAP is open source, licensed under BSD-2, and available on GitHub.

Extracting token-level signals of syntactic processing from fMRI - with an application to PoS induction
Joachim Bingel | Maria Barrett | Anders Søgaard
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Text Simplification as Tree Labeling
Joachim Bingel | Anders Søgaard
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Weakly Supervised Part-of-speech Tagging Using Eye-tracking Data
Maria Barrett | Joachim Bingel | Frank Keller | Anders Søgaard
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)


Named Entity Tagging a Very Large Unbalanced Corpus: Training and Evaluating NE Classifiers
Joachim Bingel | Thomas Haider
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We describe a systematic and application-oriented approach to training and evaluating named entity recognition and classification (NERC) systems, the purpose of which is to identify an optimal system and to train an optimal model for named entity tagging DeReKo, a very large general-purpose corpus of contemporary German (Kupietz et al., 2010). DeReKo's strong dispersion with respect to genre, register and time forces us to base our decision for a specific NERC system on an evaluation performed on a representative sample of DeReKo instead of performance figures that have been reported for the individual NERC systems when evaluated on more uniform and less diverse data. We create and manually annotate such a representative sample as evaluation data for three different NERC systems, for each of which various models are learnt on multiple training data. The proposed sampling method can be viewed as a generally applicable method for sampling evaluation data from an unbalanced target corpus for any sort of natural language processing.