LessLex: Linking Multilingual Embeddings to SenSe Representations of LEXical Items

Davide Colla, Enrico Mensa, Daniele P. Radicioni


Abstract
We present LESSLEX, a novel multilingual lexical resource. Unlike the vast majority of existing approaches, we ground our embeddings on a sense inventory made available by the BabelNet semantic network. In this setting, multilingual access is governed by the mapping of terms onto their underlying sense descriptions, such that all vectors co-exist in the same semantic space. As a result, each term has a “blended” terminological vector along with vectors describing all senses associated with that term. LESSLEX has been tested on three tasks relevant to lexical semantics: conceptual similarity, contextual similarity, and semantic text similarity. We experimented on the principal data sets for these tasks in their multilingual and crosslingual variants, improving on or closely approaching state-of-the-art results. We conclude by arguing that LESSLEX vectors may be relevant for practical applications and for research on conceptual and lexical access and competence.
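The layout described in the abstract (terms in several languages mapped onto shared sense descriptions, with a term-level vector and sense-level vectors in one space) can be pictured with a minimal sketch. Everything below is an assumption made for illustration: the sense identifiers, the three-dimensional vectors, and the helper names `vectors_for`, `shared_senses`, and `cosine` are hypothetical and are not taken from LESSLEX or BabelNet.

```python
import math

# Hypothetical sense inventory: identifiers and vectors are invented
# for illustration and do not come from LESSLEX or BabelNet.
SENSE_VECTORS = {
    "sense:bank_institution": [0.9, 0.1, 0.0],
    "sense:bank_riverside": [0.0, 0.2, 0.9],
}

# Terms from different languages map onto the same sense identifiers.
TERM_TO_SENSES = {
    ("en", "bank"): ["sense:bank_institution", "sense:bank_riverside"],
    ("it", "banca"): ["sense:bank_institution"],
    ("it", "riva"): ["sense:bank_riverside"],
}

# Hypothetical term-level ("blended") vectors in the same space
# as the sense vectors.
TERM_VECTORS = {
    ("en", "bank"): [0.5, 0.15, 0.45],
    ("it", "banca"): [0.9, 0.1, 0.0],
    ("it", "riva"): [0.0, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def vectors_for(lang, term):
    """Return the term-level vector plus one vector per associated sense."""
    key = (lang, term)
    senses = {s: SENSE_VECTORS[s] for s in TERM_TO_SENSES[key]}
    return {"term": TERM_VECTORS[key], "senses": senses}


def shared_senses(key_a, key_b):
    """Sense identifiers that two (language, term) pairs have in common."""
    return [s for s in TERM_TO_SENSES[key_a] if s in TERM_TO_SENSES[key_b]]
```

In this toy setup, `shared_senses(("en", "bank"), ("it", "banca"))` finds the sense both terms point to, and because every vector lives in the same space, `cosine` can compare a term vector in one language with a term or sense vector in another. How the actual resource builds and compares its vectors is described in the paper itself.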
Anthology ID:
2020.cl-2.3
Volume:
Computational Linguistics, Volume 46, Issue 2 - June 2020
Month:
June
Year:
2020
Venue:
CL
Pages:
289–333
URL:
https://aclanthology.org/2020.cl-2.3
DOI:
10.1162/coli_a_00375
Cite (ACL):
Davide Colla, Enrico Mensa, and Daniele P. Radicioni. 2020. LessLex: Linking Multilingual Embeddings to SenSe Representations of LEXical Items. Computational Linguistics, 46(2):289–333.
Cite (Informal):
LessLex: Linking Multilingual Embeddings to SenSe Representations of LEXical Items (Colla et al., CL 2020)
PDF:
https://aclanthology.org/2020.cl-2.3.pdf