Stefan Jänicke


2023

Named Entity Annotation Projection Applied to Classical Languages
Tariq Yousef | Chiara Palladino | Gerhard Heyer | Stefan Jänicke
Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

In this study, we demonstrate how to apply cross-lingual annotation projection to transfer named-entity annotations to classical languages for which limited or no resources and annotated texts are available, aiming to enrich their NER training datasets and train a model to perform NER tagging. Our method uses sentence-level aligned parallel corpora of ancient texts and their translations in a modern language for which high-quality off-the-shelf NER systems are available. We automatically annotate the text of the modern language and employ a state-of-the-art neural word alignment system to find translation equivalents. Finally, we transfer the annotations to the corresponding tokens in the ancient texts using a direct projection heuristic. We applied our method to ancient Greek, Latin, and Arabic using the Bible with the English translation as a parallel corpus. We used the resulting annotations to enhance the performance of an existing NER model for ancient Greek.
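The projection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `project_annotations`, the flat entity labels (no BIO scheme), the tie-breaking rule (the first projected label wins), and the toy English/Latin sentence pair with its alignment links are all assumptions made for illustration. In practice the tags would come from an off-the-shelf NER system and the links from a neural word aligner.

```python
from typing import List, Tuple


def project_annotations(
    source_tags: List[str],
    alignments: List[Tuple[int, int]],
    target_len: int,
) -> List[str]:
    """Copy entity labels from source tokens to their aligned target tokens.

    source_tags: one label per source token, "O" meaning "no entity".
    alignments:  (source_index, target_index) word-alignment links.
    target_len:  number of tokens in the target sentence.
    """
    target_tags = ["O"] * target_len
    for src_idx, tgt_idx in alignments:
        tag = source_tags[src_idx]
        # Assumed tie-breaking rule: keep the first label a target token receives.
        if tag != "O" and target_tags[tgt_idx] == "O":
            target_tags[tgt_idx] = tag
    return target_tags


# Invented toy example (not taken from the paper's corpus).
english_tokens = ["Paul", "went", "to", "Damascus"]
english_tags = ["PER", "O", "O", "LOC"]      # as if produced by an English NER system
latin_tokens = ["Paulus", "ivit", "Damascum"]
links = [(0, 0), (1, 1), (3, 2)]             # as if produced by a word aligner

projected = project_annotations(english_tags, links, len(latin_tokens))
for token, tag in zip(latin_tokens, projected):
    print(token, tag)
```

Target tokens without an alignment link keep the label "O", so the quality of the projected annotations depends directly on the quality of the word alignments.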

EVALIGN: Visual Evaluation of Translation Alignment Models
Tariq Yousef | Gerhard Heyer | Stefan Jänicke
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

This paper presents EvAlign, a visual analytics framework for quantitative and qualitative evaluation of automatic translation alignment models. EvAlign offers various visualization views enabling developers to visualize their models’ predictions and compare the performance of their models with other baseline and state-of-the-art models. Through different search and filter functions, researchers and practitioners can also inspect the frequent alignment errors and their positions. EvAlign hosts nine gold standard datasets and the predictions of multiple alignment models. The tool is extendable, and adding additional datasets and models is straightforward. EvAlign can be deployed and used locally and is available on GitHub.
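As an illustration of what a quantitative evaluation of an alignment model can involve, the sketch below computes precision, recall, and alignment error rate (AER) against a gold standard with "sure" and "possible" links. The abstract does not state which metrics EvAlign reports, so the choice of these standard measures, the function name `alignment_scores`, and the toy link sets are assumptions.

```python
from typing import Dict, Set, Tuple

Link = Tuple[int, int]  # (source_index, target_index)


def alignment_scores(
    predicted: Set[Link], sure: Set[Link], possible: Set[Link]
) -> Dict[str, float]:
    """Precision, recall, and AER for one set of predicted alignment links.

    `possible` is expected to contain every link in `sure`.
    """
    possible = possible | sure  # enforce the superset relation defensively
    hits_possible = len(predicted & possible)
    hits_sure = len(predicted & sure)
    precision = hits_possible / len(predicted) if predicted else 0.0
    recall = hits_sure / len(sure) if sure else 0.0
    denominator = len(predicted) + len(sure)
    aer = 1.0 - (hits_sure + hits_possible) / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "aer": aer}


# Invented toy data: two sure links, one additional possible link.
gold_sure = {(0, 0), (1, 1)}
gold_possible = gold_sure | {(2, 2)}
model_output = {(0, 0), (2, 2), (3, 1)}

scores = alignment_scores(model_output, gold_sure, gold_possible)
print(scores)
```

Lower AER is better; a prediction that contains every sure link and no link outside the possible set has an AER of 0.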

2021

Summary Explorer: Visualizing the State of the Art in Text Summarization
Shahbaz Syed | Tariq Yousef | Khalid Al Khatib | Stefan Jänicke | Martin Potthast
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

This paper introduces Summary Explorer, a new tool to support the manual inspection of text summarization systems by compiling the outputs of 55 state-of-the-art single document summarization approaches on three benchmark datasets, and visually exploring them during a qualitative assessment. The underlying design of the tool considers three well-known summary quality criteria (coverage, faithfulness, and position bias), encapsulated in a guided assessment based on tailored visualizations. The tool complements existing approaches for locally debugging summarization models and improves upon them. The tool is available at https://tldr.webis.de/