Marc Hübner


2020

Defx at SemEval-2020 Task 6: Joint Extraction of Concepts and Relations for Definition Extraction
Marc Hübner | Christoph Alt | Robert Schwarzenberg | Leonhard Hennig
Proceedings of the Fourteenth Workshop on Semantic Evaluation

Definition Extraction systems are a valuable knowledge source for both humans and algorithms. In this paper, we describe our submissions to the DeftEval shared task (SemEval-2020 Task 6), which is evaluated on an English textbook corpus. We provide a detailed explanation of our system for the joint extraction of definition concepts and the relations among them. Furthermore, we provide an ablation study of our model variations and describe the results of an error analysis.

2019

Fine-tuning Pre-Trained Transformer Language Models to Distantly Supervised Relation Extraction
Christoph Alt | Marc Hübner | Leonhard Hennig
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Distantly supervised relation extraction is widely used to extract relational facts from text, but suffers from noisy labels. Current relation extraction methods try to alleviate the noise by multi-instance learning and by providing supporting linguistic and contextual information to more efficiently guide the relation classification. Although these methods achieve state-of-the-art results, we observed them to be biased towards recognizing a limited set of relations with high precision, while ignoring those in the long tail. To address this gap, we utilize a pre-trained language model, the OpenAI Generative Pre-trained Transformer (GPT) (Radford et al., 2018). The GPT and similar models have been shown to capture semantic and syntactic features, and also a notable amount of “common-sense” knowledge, which we hypothesize are important features for recognizing a more diverse set of relations. By extending the GPT to the distantly supervised setting, and fine-tuning it on the NYT10 dataset, we show that it predicts a larger set of distinct relation types with high confidence. Manual and automated evaluation of our model shows that it achieves a state-of-the-art AUC score of 0.422 on the NYT10 dataset, and performs especially well at higher recall levels.

Layerwise Relevance Visualization in Convolutional Text Graph Classifiers
Robert Schwarzenberg | Marc Hübner | David Harbecke | Christoph Alt | Leonhard Hennig
Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-13)

Representations in the hidden layers of Deep Neural Networks (DNN) are often hard to interpret since it is difficult to project them into an interpretable domain. Graph Convolutional Networks (GCN) allow this projection, but existing explainability methods do not exploit this fact, i.e., they do not focus their explanations on intermediate states. In this work, we present a novel method that traces and visualizes features that contribute to a classification decision in the visible and hidden layers of a GCN. Our method exposes hidden cross-layer dynamics in the input graph structure. We experimentally demonstrate that it yields meaningful layerwise explanations for a GCN sentence classifier.

2018

Learning Comment Controversy Prediction in Web Discussions Using Incidentally Supervised Multi-Task CNNs
Nils Rethmeier | Marc Hübner | Leonhard Hennig
Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

Comments on web news contain controversies that manifest as inter-group agreement-conflicts. Tracking such rapidly evolving controversy could ease conflict resolution or journalist-user interaction. However, this presupposes controversy online-prediction that scales to diverse domains using incidental supervision signals instead of manual labeling. To more deeply interpret comment-controversy model decisions, we frame prediction as binary classification and evaluate baselines and multi-task CNNs that use an auxiliary news-genre-encoder. Finally, we use ablation and interpretability methods to determine the impacts of topic, discourse and sentiment indicators, contextual vs. global word influence, as well as genre-keywords vs. per-genre-controversy keywords – to find that the models learn plausible controversy features using only incidentally supervised signals.