Tobias Klatt


pdf bib
TrainX – Named Entity Linking with Active Sampling and Bi-Encoders
Tom Oberhauser | Tim Bischoff | Karl Brendel | Maluna Menke | Tobias Klatt | Amy Siu | Felix Alexander Gers | Alexander Löser
Proceedings of the 28th International Conference on Computational Linguistics: System Demonstrations

We demonstrate TrainX, a system for Named Entity Linking for medical experts. It combines state-of-the-art entity recognition and linking architectures, such as Flair and fine-tuned Bi-Encoders based on BERT, with an easy-to-use interface for healthcare professionals. We support medical experts in annotating training data by using active sampling strategies to forward informative samples to the annotator. We demonstrate that our model is capable of linking against large knowledge bases, such as UMLS (3.6 million entities), and supporting zero-shot cases, where the linker has never seen the entity before. Those zero-shot capabilities help to mitigate the problem of rare and expensive training data that is a common issue in the medical domain.


pdf bib
Analysing Errors of Open Information Extraction Systems
Rudolf Schneider | Tom Oberhauser | Tobias Klatt | Felix A. Gers | Alexander Löser
Proceedings of the First Workshop on Building Linguistically Generalizable NLP Systems

We report results on benchmarking Open Information Extraction (OIE) systems using RelVis, a toolkit for benchmarking Open Information Extraction systems. Our comprehensive benchmark contains three data sets from the news domain and one data set from Wikipedia with overall 4522 labeled sentences and 11243 binary or n-ary OIE relations. In our analysis on these data sets we compared the performance of four popular OIE systems, ClausIE, OpenIE 4.2, Stanford OpenIE and PredPatt. In addition, we evaluated the impact of five common error classes on a subset of 749 n-ary tuples. From our deep analysis we unreveal important research directions for a next generation on OIE systems.