Evaluating the Wordnet and CoRoLa-based Word Embedding Vectors for Romanian as Resources in the Task of Microworlds Lexicon Expansion

Elena Irimia, Maria Mitrofan, Verginica Mititelu


Abstract
Within the larger framework of facilitating human-robot interaction, we present the creation of a core vocabulary to be learned by a robot. The vocabulary is extracted from two tokenized and lemmatized scenarios pertaining to two imagined microworlds in which the robot is supposed to play an assistive role. We also evaluate the utility of two resources for expanding this vocabulary so that it better covers the robot’s communication needs. The language under study is Romanian, and the resources are the Romanian wordnet and word embedding vectors extracted from CoRoLa, the large representative corpus of contemporary Romanian. The evaluation covers two settings: one in which the words are not semantically disambiguated before the lexicon is expanded, and one in which they are disambiguated with senses from the Romanian wordnet. The appropriateness of each resource is discussed.
Anthology ID:
2019.gwc-1.22
Volume:
Proceedings of the 10th Global Wordnet Conference
Month:
July
Year:
2019
Address:
Wroclaw, Poland
Venue:
GWC
Publisher:
Global Wordnet Association
Pages:
176–184
URL:
https://aclanthology.org/2019.gwc-1.22
Cite (ACL):
Elena Irimia, Maria Mitrofan, and Verginica Mititelu. 2019. Evaluating the Wordnet and CoRoLa-based Word Embedding Vectors for Romanian as Resources in the Task of Microworlds Lexicon Expansion. In Proceedings of the 10th Global Wordnet Conference, pages 176–184, Wroclaw, Poland. Global Wordnet Association.
Cite (Informal):
Evaluating the Wordnet and CoRoLa-based Word Embedding Vectors for Romanian as Resources in the Task of Microworlds Lexicon Expansion (Irimia et al., GWC 2019)
PDF:
https://aclanthology.org/2019.gwc-1.22.pdf