Improving Wordnets for Under-Resourced Languages Using Machine Translation

Bharathi Raja Chakravarthi, Mihael Arcan, John P. McCrae


Abstract
Wordnets are extensively used in natural language processing, but current approaches to manually building a wordnet from scratch involve large research groups working over long periods of time, resources that are typically not available for under-resourced languages. Even if wordnet-like resources are available for under-resourced languages, they are often not easily accessible, which can alter the results of applications using these resources. We propose a method that follows the expand approach to improve and generate wordnets with the help of machine translation. We apply our method to improve and extend wordnets for the Dravidian languages Tamil, Telugu, and Kannada, which are severely under-resourced. We report evaluation results for the generated wordnet senses in terms of precision for these languages. In addition, we carried out a manual evaluation of the translations for Tamil, where we demonstrate that our approach can aid in improving wordnet resources for under-resourced Dravidian languages.
Anthology ID:
2018.gwc-1.10
Volume:
Proceedings of the 9th Global Wordnet Conference
Month:
January
Year:
2018
Address:
Nanyang Technological University (NTU), Singapore
Editors:
Francis Bond, Piek Vossen, Christiane Fellbaum
Venue:
GWC
SIG:
SIGLEX
Publisher:
Global Wordnet Association
Pages:
77–86
URL:
https://aclanthology.org/2018.gwc-1.10
Cite (ACL):
Bharathi Raja Chakravarthi, Mihael Arcan, and John P. McCrae. 2018. Improving Wordnets for Under-Resourced Languages Using Machine Translation. In Proceedings of the 9th Global Wordnet Conference, pages 77–86, Nanyang Technological University (NTU), Singapore. Global Wordnet Association.
Cite (Informal):
Improving Wordnets for Under-Resourced Languages Using Machine Translation (Chakravarthi et al., GWC 2018)
PDF:
https://aclanthology.org/2018.gwc-1.10.pdf