Frank Martin Mtumbuka


2024

pdf bib
Entity or Relation Embeddings? An Analysis of Encoding Strategies for Relation Extraction
Frank Martin Mtumbuka | Steven Schockaert
Findings of the Association for Computational Linguistics: EMNLP 2024

Existing approaches to relation extraction obtain relation embeddings by concatenating embeddings of the head and tail entities. Despite the popularity of this approach, we find that such representations mostly capture the types of the entities involved, leading to false positives and confusion between relations that involve entities of the same type. Another possibility is to use a prompt with a [MASK] token to directly learn relation embeddings, but this approach tends to perform poorly. We show that this underperformance comes from the fact that information about entity types is insufficiently captured by the [MASK] embeddings. We therefore propose a simple model, which combines such [MASK] embeddings with entity embeddings. Despite its simplicity, our model consistently outperforms the state-of-the-art across several benchmarks, even when the entity embeddings are obtained from a pre-trained entity typing model. We also experiment with a self-supervised pre-training strategy which further improves the results.