Hannah Sterz


2024

M2QA: Multi-domain Multilingual Question Answering
Leon Engländer | Hannah Sterz | Clifton A Poth | Jonas Pfeiffer | Ilia Kuznetsov | Iryna Gurevych
Findings of the Association for Computational Linguistics: EMNLP 2024

Generalization and robustness to input variation are core desiderata of machine learning research. Language varies along several axes, most importantly, language instance (e.g. French) and domain (e.g. news). While adapting NLP models to new languages within a single domain, or to new domains within a single language, is widely studied, research in joint adaptation is hampered by the lack of evaluation datasets. This prevents the transfer of NLP systems from well-resourced languages and domains to non-dominant language-domain combinations. To address this gap, we introduce M2QA, a multi-domain multilingual question answering benchmark. M2QA includes 13,500 SQuAD 2.0-style question-answer instances in German, Turkish, and Chinese for the domains of product reviews, news, and creative writing. We use M2QA to explore cross-lingual cross-domain performance of fine-tuned models and state-of-the-art LLMs and investigate modular approaches to domain and language adaptation. We witness **1)** considerable performance _variations_ across domain-language combinations within model classes and **2)** considerable performance _drops_ between source and target language-domain combinations across all model sizes. We demonstrate that M2QA is far from solved, and new methods to effectively transfer both linguistic and domain-specific information are necessary.

2023

ML Mob at SemEval-2023 Task 1: Probing CLIP on Visual Word-Sense Disambiguation
Clifton Poth | Martin Hentschel | Tobias Werner | Hannah Sterz | Leonard Bongard
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

Successful word sense disambiguation (WSD) is a fundamental element of natural language understanding. As part of SemEval-2023 Task 1, we investigate WSD in a multimodal setting, where ambiguous words are to be matched with candidate images representing word senses. We compare multiple systems based on pre-trained CLIP models. In our experiments, we find CLIP to have solid zero-shot performance on monolingual and multilingual data. By employing different fine-tuning techniques, we are able to further enhance performance. However, transferring knowledge between data distributions proves to be more challenging.

ML Mob at SemEval-2023 Task 5: “Breaking News: Our Semi-Supervised and Multi-Task Learning Approach Spoils Clickbait”
Hannah Sterz | Leonard Bongard | Tobias Werner | Clifton Poth | Martin Hentschel
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

Online articles using striking headlines that promise intriguing information are often used to attract readers. Most of the time, the information provided in the text is disappointing to the reader after the headline promised exciting news. As part of the SemEval-2023 challenge, we propose a system to generate a spoiler for these headlines. The spoiler provides the information promised by the headline and eliminates the need to read the full article. We consider Multi-Task Learning and generating more data using a distillation approach in our system. With this, we achieve an F1 score up to 51.48% on extracting the spoiler from the articles.

Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
Clifton Poth | Hannah Sterz | Indraneil Paul | Sukannya Purkayastha | Leon Engländer | Timo Imhof | Ivan Vulić | Sebastian Ruder | Iryna Gurevych | Jonas Pfeiffer
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

We introduce Adapters, an open-source library that unifies parameter-efficient and modular transfer learning in large language models. By integrating 10 diverse adapter methods into a unified interface, Adapters offers ease of use and flexible configuration. Our library allows researchers and practitioners to leverage adapter modularity through composition blocks, enabling the design of complex adapter setups. We demonstrate the library’s efficacy by evaluating its performance against full fine-tuning on various NLP tasks. Adapters provides a powerful tool for addressing the challenges of conventional fine-tuning paradigms and promoting more efficient and modular transfer learning. The library is available via https://adapterhub.ml/adapters.

2022

UKP-SQUARE: An Online Platform for Question Answering Research
Tim Baumgärtner | Kexin Wang | Rachneet Sachdeva | Gregor Geigle | Max Eichler | Clifton Poth | Hannah Sterz | Haritz Puerto | Leonardo F. R. Ribeiro | Jonas Pfeiffer | Nils Reimers | Gözde Şahin | Iryna Gurevych
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

Recent advances in NLP and information retrieval have given rise to a diverse set of question answering tasks that are of different formats (e.g., extractive, abstractive), require different model architectures (e.g., generative, discriminative), and setups (e.g., with or without retrieval). Despite having a large number of powerful, specialized QA pipelines (which we refer to as Skills) that consider a single domain, model or setup, there exists no framework where users can easily explore and compare such pipelines and can extend them according to their needs. To address this issue, we present UKP-SQuARE, an extensible online QA platform for researchers which allows users to query and analyze a large collection of modern Skills via a user-friendly web interface and integrated behavioural tests. In addition, QA researchers can develop, manage, and share their custom Skills using our microservices that support a wide range of models (Transformers, Adapters, ONNX), datastores and retrieval techniques (e.g., sparse and dense). UKP-SQuARE is available on https://square.ukp-lab.de