Abui Wordnet: Using a Toolbox Dictionary to develop a wordnet for a low-resource language

Frantisek Kratochvil, Luís Morgado da Costa


Abstract
This paper describes a procedure to link a Toolbox dictionary of a low-resource language to correct synsets, generating a new wordnet. We introduce a bootstrapping technique utilising the information in the gloss fields (English, national, and regional) to generate sense candidates using a naive algorithm based on multilingual sense intersection. We show that this technique is quite effective when glosses are available in more than one language. Our technique complements the previous work by Rosman et al. (2014), which linked the SIL Semantic Domains to wordnet senses. Through this work, we have created a small, fully hand-checked wordnet for Abui, containing over 1,400 concepts and 3,600 senses.
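The idea of multilingual sense intersection can be illustrated with a minimal sketch. This is not the authors' implementation: the lookup tables, the language codes `eng` and `ind`, the synset IDs, and the function name `candidate_synsets` are all hypothetical, chosen only to show how glosses in several languages might narrow down the sense candidates for one dictionary entry.

```python
from typing import Dict, Set

# Hypothetical toy lookup tables: gloss word -> synset IDs, per gloss language.
# The IDs are invented for illustration and are not real wordnet identifiers.
LEXICONS: Dict[str, Dict[str, Set[str]]] = {
    "eng": {"dog": {"s1", "s2"}, "house": {"s3"}},
    "ind": {"anjing": {"s1"}, "rumah": {"s3", "s4"}},
}


def candidate_synsets(
    glosses: Dict[str, str],
    lexicons: Dict[str, Dict[str, Set[str]]] = LEXICONS,
) -> Set[str]:
    """Return the synsets shared by every gloss that has a known sense set.

    `glosses` maps a language code to the gloss word found in that
    language's gloss field of a single dictionary entry.
    """
    per_language = []
    for lang, word in glosses.items():
        synsets = lexicons.get(lang, {}).get(word.lower())
        if synsets:  # skip glosses with no known senses
            per_language.append(set(synsets))
    if not per_language:
        return set()
    # With one gloss language this is just that language's sense set;
    # with several, only the senses common to all of them survive.
    return set.intersection(*per_language)
```

In this toy setup, an entry glossed only in one language keeps every sense of that gloss as a candidate, while an entry glossed in two languages keeps only the senses both glosses share, which is one way to read the abstract's remark that the technique works best when more than one gloss language is available. Candidates produced this way would still need the hand-checking the abstract mentions.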
Anthology ID:
2022.fieldmatters-1.7
Volume:
Proceedings of the first workshop on NLP applications to field linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Oleg Serikov, Ekaterina Voloshina, Anna Postnikova, Elena Klyachko, Ekaterina Neminova, Ekaterina Vylomova, Tatiana Shavrina, Eric Le Ferrand, Valentin Malykh, Francis Tyers, Timofey Arkhangelskiy, Vladislav Mikhailov, Alena Fenogenova
Venue:
FieldMatters
Publisher:
International Conference on Computational Linguistics
Pages:
54–63
URL:
https://aclanthology.org/2022.fieldmatters-1.7
Cite (ACL):
Frantisek Kratochvil and Luís Morgado da Costa. 2022. Abui Wordnet: Using a Toolbox Dictionary to develop a wordnet for a low-resource language. In Proceedings of the first workshop on NLP applications to field linguistics, pages 54–63, Gyeongju, Republic of Korea. International Conference on Computational Linguistics.
Cite (Informal):
Abui Wordnet: Using a Toolbox Dictionary to develop a wordnet for a low-resource language (Kratochvil & Morgado da Costa, FieldMatters 2022)
PDF:
https://aclanthology.org/2022.fieldmatters-1.7.pdf