Daniel Ruffinelli

2022

pdf bib abs
KGxBoard: Explainable and Interactive Leaderboard for Evaluation of Knowledge Graph Completion Models
Haris Widjaja | Kiril Gashteovski | Wiem Ben Rim | Pengfei Liu | Christopher Malon | Daniel Ruffinelli | Carolin Lawrence | Graham Neubig
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Knowledge Graphs (KGs) store information in the form of (head, predicate, tail)-triples. To augment KGs with new knowledge, researchers proposed models for KG Completion (KGC) tasks such as link prediction; i.e., answering (h; p; ?) or (?; p; t) queries. Such models are usually evaluated with averaged metrics on a held-out test set. While useful for tracking progress, averaged single-score metrics cannotreveal what exactly a model has learned — or failed to learn. To address this issue, we propose KGxBoard: an interactive framework for performing fine-grained evaluation on meaningful subsets of the data, each of which tests individual and interpretable capabilities of a KGC model. In our experiments, we highlight the findings that we discovered with the use of KGxBoard, which would have been impossible to detect with standard averaged single-score metrics.

2020

pdf bib abs
LibKGE - A knowledge graph embedding library for reproducible research
Samuel Broscheit | Daniel Ruffinelli | Adrian Kochsiek | Patrick Betz | Rainer Gemulla
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

LibKGE ( https://github.com/uma-pi1/kge ) is an open-source PyTorch-based library for training, hyperparameter optimization, and evaluation of knowledge graph embedding models for link prediction. The key goals of LibKGE are to enable reproducible research, to provide a framework for comprehensive experimental studies, and to facilitate analyzing the contributions of individual components of training methods, model architectures, and evaluation methods. LibKGE is highly configurable and every experiment can be fully reproduced with a single configuration file. Individual components are decoupled to the extent possible so that they can be mixed and matched with each other. Implementations in LibKGE aim to be as efficient as possible without leaving the scope of Python/Numpy/PyTorch. A comprehensive logging mechanism and tooling facilitates in-depth analysis. LibKGE provides implementations of common knowledge graph embedding models and training methods, and new ones can be easily added. A comparative study (Ruffinelli et al., 2020) showed that LibKGE reaches competitive to state-of-the-art performance for many models with a modest amount of automatic hyperparameter tuning.

2019

pdf bib abs
On Evaluating Embedding Models for Knowledge Base Completion
Yanjie Wang | Daniel Ruffinelli | Rainer Gemulla | Samuel Broscheit | Christian Meilicke
Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)

Knowledge graph embedding models have recently received significant attention in the literature. These models learn latent semantic representations for the entities and relations in a given knowledge base; the representations can be used to infer missing knowledge. In this paper, we study the question of how well recent embedding models perform for the task of knowledge base completion, i.e., the task of inferring new facts from an incomplete knowledge base. We argue that the entity ranking protocol, which is currently used to evaluate knowledge graph embedding models, is not suitable to answer this question since only a subset of the model predictions are evaluated. We propose an alternative entity-pair ranking protocol that considers all model predictions as a whole and is thus more suitable to the task. We conducted an experimental study on standard datasets and found that the performance of popular embeddings models was unsatisfactory under the new protocol, even on datasets that are generally considered to be too easy. Moreover, we found that a simple rule-based model often provided superior performance. Our findings suggest that there is a need for more research into embedding models as well as their training strategies for the task of knowledge base completion.

Co-authors

Venues

emnlp2
repl4nlp1