Leveraging RDF Graphs for Crossing Multiple Bilingual Dictionaries

Marta Villegas, Maite Melero, Núria Bel, Jorge Gracia


Abstract
The experiments presented here exploit the properties of the Apertium RDF Graph, principally cycle density and node degree, to automatically generate new translation relations between words and thereby enrich existing bilingual dictionaries with new entries. Currently, the Apertium RDF Graph includes data from 22 Apertium bilingual dictionaries and constitutes a large unified array of linked lexical entries and translations that are available and accessible on the Web (http://linguistic.linkeddata.es/apertium/). In particular, its graph structure allows for interesting exploitation opportunities, some of which are addressed in this paper. Two ‘massive’ experiments are reported. In the first, the original EN-ES translation set was removed from the Apertium RDF Graph and a new EN-ES version was generated; the results were compared against the previously removed EN-ES data and against the Concise Oxford Spanish Dictionary. In the second, a new, previously non-existent EN-FR translation set was generated; in this case the results were compared against a converted Wiktionary English-French file. The results are good, and the method performs well in the extreme case of correlated polysemy. This led us to consider using cycles and node degree to identify potential oddities in the source data: if cycle density proves effective for identifying potential translation targets, we can assume that, in dense graphs, nodes with low degree may indicate potential errors.
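As an illustration of the cycle-density idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a deliberately simplified setting: translations form an undirected graph over (lemma, language) nodes, a candidate target must be reachable from the source through at least two distinct pivot words (which closes a cycle), and the score is the edge density of the subgraph spanned by the source, the target and those pivots. The function names, the scoring rule and the toy word pairs are all hypothetical.

```python
from itertools import combinations


def build_graph(pairs):
    """Build an undirected adjacency map from (node, node) translation pairs."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def density(graph, nodes):
    """Fraction of possible edges that exist among the given nodes."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    edges = sum(1 for u, v in combinations(nodes, 2) if v in graph[u])
    return edges / (n * (n - 1) / 2)


def candidate_translations(graph, source, target_lang):
    """Score unlinked target-language nodes that share >= 2 pivots with source.

    Assumption of this sketch: two distinct pivots are enough to close a
    cycle (source, pivot 1, target, pivot 2, source).
    """
    pivots_by_target = {}
    for pivot in graph.get(source, ()):
        for node in graph[pivot]:
            is_new = node != source and node not in graph[source]
            if is_new and node[1] == target_lang:
                pivots_by_target.setdefault(node, set()).add(pivot)
    return {
        target: density(graph, {source, target} | pivots)
        for target, pivots in pivots_by_target.items()
        if len(pivots) >= 2
    }


# Toy data invented for illustration; not taken from the Apertium RDF Graph.
pairs = [
    (("house", "en"), ("casa", "ca")),
    (("house", "en"), ("maison", "fr")),
    (("casa", "ca"), ("maison", "fr")),
    (("casa", "ca"), ("casa", "es")),
    (("maison", "fr"), ("casa", "es")),
    (("casa", "ca"), ("hogar", "es")),  # reachable through one pivot only
]
graph = build_graph(pairs)
scores = candidate_translations(graph, ("house", "en"), "es")
for target, score in scores.items():
    print(target, round(score, 3))
```

In this toy graph only ("casa", "es") is proposed for ("house", "en"), because ("hogar", "es") is reachable through a single pivot and therefore lies on no cycle with the source. A complementary check in the same spirit would flag nodes whose degree is unusually low inside an otherwise dense neighbourhood as possible errors in the source data.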
Anthology ID:
L16-1140
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
868–876
URL:
https://aclanthology.org/L16-1140
Cite (ACL):
Marta Villegas, Maite Melero, Núria Bel, and Jorge Gracia. 2016. Leveraging RDF Graphs for Crossing Multiple Bilingual Dictionaries. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 868–876, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Leveraging RDF Graphs for Crossing Multiple Bilingual Dictionaries (Villegas et al., LREC 2016)
PDF:
https://aclanthology.org/L16-1140.pdf
Code:
martavillegas/ApertiumRDF