Marcel Bollmann


Two Decades of the ACL Anthology: Development, Impact, and Open Challenges
Marcel Bollmann | Nathan Schneider | Arne Köhn | Matt Post
Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)

The ACL Anthology is a prime resource for research papers within computational linguistics and natural language processing, while continuing to be an open-source and community-driven project. Since Gildea et al. (2018) reported on its state and planned directions, the Anthology has seen major technical changes. We discuss what led to these changes and how they impact long-term maintainability and community engagement, describe which open-source data and software tools the Anthology currently provides, and provide a survey of literature that has used the Anthology as a main data source.


How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task
Rahul Aralikatte | Héctor Ricardo Murrieta Bello | Miryam de Lhoneux | Daniel Hershcovich | Marcel Bollmann | Anders Søgaard
Proceedings of the 8th Workshop on Asian Translation (WAT2021)

This work shows that competitive translation results can be obtained in a constrained setting by incorporating the latest advances in memory and compute optimization. We train and evaluate large multilingual translation models using a single GPU for a maximum of 100 hours and get within 4-5 BLEU points of the top submission on the leaderboard. We also benchmark standard baselines on the PMI corpus and re-discover well-known shortcomings of translation systems and metrics.

Moses and the Character-Based Random Babbling Baseline: CoAStaL at AmericasNLP 2021 Shared Task
Marcel Bollmann | Rahul Aralikatte | Héctor Murrieta Bello | Daniel Hershcovich | Miryam de Lhoneux | Anders Søgaard
Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas

We evaluated a range of neural machine translation techniques developed specifically for low-resource scenarios. Unsuccessfully. In the end, we submitted two runs: (i) a standard phrase-based model, and (ii) a random babbling baseline using character trigrams. We found that it was surprisingly hard to beat (i), in spite of this model being, in theory, a bad fit for polysynthetic languages; and more interestingly, that (ii) was better than several of the submitted systems, highlighting how difficult low-resource machine translation for polysynthetic languages is.

Error Analysis and the Role of Morphology
Marcel Bollmann | Anders Søgaard
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

We evaluate two common conjectures in error analysis of NLP models: (i) Morphology is predictive of errors; and (ii) the importance of morphology increases with the morphological complexity of a language. We show across four different tasks and up to 57 languages that of these conjectures, somewhat surprisingly, only (i) is true. Using morphological features does improve error prediction across tasks; however, this effect is less pronounced with morphologically complex languages. We speculate this is because morphology is more discriminative in morphologically simple languages. Across all four tasks, case and gender are the morphological features most predictive of error.


On Forgetting to Cite Older Papers: An Analysis of the ACL Anthology
Marcel Bollmann | Desmond Elliott
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The field of natural language processing is experiencing a period of unprecedented growth, and with it a surge of published papers. This represents an opportunity for us to take stock of how we cite the work of other researchers, and whether this growth comes at the expense of “forgetting” about older literature. In this paper, we address this question through bibliographic analysis. By looking at the age of outgoing citations in papers published at selected ACL venues between 2010 and 2019, we find that there is indeed a tendency for recent papers to cite more recent work, but the rate at which papers older than 15 years are cited has remained relatively stable.


Naive Regularizers for Low-Resource Neural Machine Translation
Meriem Beloucif | Ana Valeria Gonzalez | Marcel Bollmann | Anders Søgaard
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

Neural machine translation models have little inductive bias, which can be a disadvantage in low-resource scenarios. Neural models have to be trained on large amounts of data and have been shown to perform poorly when only limited data is available. We show that using naive regularization methods, based on sentence length, punctuation and word frequencies, to penalize translations that are very different from the input sentences, consistently improves the translation quality across multiple low-resource languages. We experiment with 12 language pairs, varying the training data size between 17k and 230k sentence pairs. Our best regularizer achieves an average increase of 1.5 BLEU score and 1.0 TER score across all the language pairs. For example, we achieve a BLEU score of 26.70 on the IWSLT15 English–Vietnamese translation task simply by using relative differences in punctuation as a regularizer.

A Large-Scale Comparison of Historical Text Normalization Systems
Marcel Bollmann
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

There is no consensus on the state-of-the-art approach to historical text normalization. Many techniques have been proposed, including rule-based methods, distance metrics, character-based statistical machine translation, and neural encoder–decoder models, but studies have used different datasets and different evaluation methods, and have come to different conclusions. This paper presents the largest study of historical text normalization done so far. We critically survey the existing literature and report experiments on eight languages, comparing systems spanning all categories of proposed normalization techniques, analysing the effect of training data quantity, and using different evaluation methods. The datasets and scripts are made publicly available.

Few-Shot and Zero-Shot Learning for Historical Text Normalization
Marcel Bollmann | Natalia Korchagina | Anders Søgaard
Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)

Historical text normalization often relies on small training datasets. Recent work has shown that multi-task learning can lead to significant improvements by exploiting synergies with related datasets, but there has been no systematic study of different multi-task learning architectures. This paper evaluates 63 multi-task learning configurations for sequence-to-sequence-based historical text normalization across ten datasets from eight languages, using autoencoding, grapheme-to-phoneme mapping, and lemmatization as auxiliary tasks. We observe consistent, significant improvements across languages when training data for the target task is limited, but minimal or no improvements when training data is abundant. We also show that zero-shot learning outperforms the simple, but relatively strong, identity baseline.

Historical Text Normalization with Delayed Rewards
Simon Flachs | Marcel Bollmann | Anders Søgaard
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Training neural sequence-to-sequence models with simple token-level log-likelihood is now a standard approach to historical text normalization, albeit often outperformed by phrase-based models. Policy gradient training enables direct optimization for exact matches, and while the small datasets in historical text normalization make from-scratch reinforcement learning prohibitive, we show that policy gradient fine-tuning leads to significant improvements across the board. Policy gradient training, in particular, leads to more accurate normalizations for long or unseen words.


Multi-task learning for historical text normalization: Size matters
Marcel Bollmann | Anders Søgaard | Joachim Bingel
Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP

Historical text normalization suffers from small datasets that exhibit high variance, and previous work has shown that multi-task learning can be used to leverage data from related problems in order to obtain more robust models. Previous work has been limited to datasets from a specific language and a specific historical period, and it is not clear whether results generalize. It therefore remains an open problem when historical text normalization benefits from multi-task learning. We explore the benefits of multi-task learning across 10 different datasets, representing different languages and periods. Our main finding, contrary to what has been observed for other NLP tasks, is that multi-task learning mainly works when target task data is very scarce.


Learning attention for historical text normalization by learning to pronounce
Marcel Bollmann | Joachim Bingel | Anders Søgaard
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Automated processing of historical texts often relies on pre-normalization to modern word forms. Training encoder-decoder architectures to solve such problems typically requires a lot of training data, which is not available for the named task. We address this problem by using several novel encoder-decoder architectures, including a multi-task learning (MTL) architecture using a grapheme-to-phoneme dictionary as auxiliary data, advancing the state of the art by an absolute 2% increase in performance. We analyze the induced models across 44 different texts from Early New High German. Interestingly, we observe that, as previously conjectured, multi-task learning can learn to focus attention during decoding, in ways remarkably similar to recently proposed attention mechanisms. This, we believe, is an important step toward understanding how MTL works.


Evaluating Inter-Annotator Agreement on Historical Spelling Normalization
Marcel Bollmann | Stefanie Dipper | Florian Petran
Proceedings of the 10th Linguistic Annotation Workshop held in conjunction with ACL 2016 (LAW-X 2016)

Improving historical spelling normalization with bi-directional LSTMs and multi-task learning
Marcel Bollmann | Anders Søgaard
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Natural-language processing of historical documents is complicated by the abundance of variant spellings and lack of annotated data. A common approach is to normalize the spelling of historical words to modern forms. We explore the suitability of a deep neural network architecture for this task, particularly a deep bi-LSTM network applied on a character level. Our model compares well to previously established normalization algorithms when evaluated on a diverse set of texts from Early New High German. We show that multi-task learning with additional normalization data can improve our model’s performance further.


CorA: A web-based annotation tool for historical and other non-standard language data
Marcel Bollmann | Florian Petran | Stefanie Dipper | Julia Krasselt
Proceedings of the 8th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH)


POS Tagging for Historical Texts with Sparse Training Data
Marcel Bollmann
Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse


Adapting SimpleNLG to German
Marcel Bollmann
Proceedings of the 13th European Workshop on Natural Language Generation

Rule-Based Normalization of Historical Texts
Marcel Bollmann | Florian Petran | Stefanie Dipper
Proceedings of the Workshop on Language Technologies for Digital Humanities and Cultural Heritage