Mireia Roig Mirapeix


2020

Definition Extraction Feature Analysis: From Canonical to Naturally-Occurring Definitions
Mireia Roig Mirapeix | Luis Espinosa Anke | Jose Camacho-Collados
Proceedings of the Workshop on the Cognitive Aspects of the Lexicon

Textual definitions constitute a fundamental source of knowledge when seeking the meaning of words, and they are the cornerstone of lexical resources such as glossaries, dictionaries, encyclopedias, and thesauri. In this paper, we present an in-depth analytical study of the main features relevant to the task of definition extraction. Our main goal is to study whether linguistic structures from canonical definitions (the Aristotelian or genus et differentia model) can be leveraged to retrieve definitions from corpora across different domains of knowledge and textual genres. To this end, we develop a simple linear classifier and analyze the contribution of several (sets of) linguistic features. Finally, as a result of our experiments, we also shed light on the particularities of existing benchmarks and on the most challenging aspects of the task.