Cross-lingual annotation: a road map for low- and no-resource languages

Meagan Vigus, Jens E. L. Van Gysel, Tim O’Gorman, Andrew Cowell, Rosa Vallejos, William Croft


Abstract
This paper presents a “road map” for the annotation of semantic categories in typologically diverse languages that may have few linguistic resources and often have no existing computational resources. Past semantic annotation efforts have focused largely on high-resource languages, or on relatively low-resource languages with a large number of native speakers. However, certain typological traits, namely the synthesis of multiple concepts into a single word, are more common in languages with smaller speech communities. For example, what is expressed as a sentence in a more analytic language like English may be expressed as a single word in a more synthetic language like Arapaho. This paper proposes solutions for annotating analytic and synthetic languages in a comparable way, based on existing typological research, and introduces a road map for the annotation of languages with a dearth of resources.
Anthology ID:
2020.dmr-1.4
Volume:
Proceedings of the Second International Workshop on Designing Meaning Representations
Month:
December
Year:
2020
Address:
Barcelona, Spain (online)
Venues:
COLING | DMR
Publisher:
Association for Computational Linguistics
Pages:
30–40
URL:
https://aclanthology.org/2020.dmr-1.4
Cite (ACL):
Meagan Vigus, Jens E. L. Van Gysel, Tim O’Gorman, Andrew Cowell, Rosa Vallejos, and William Croft. 2020. Cross-lingual annotation: a road map for low- and no-resource languages. In Proceedings of the Second International Workshop on Designing Meaning Representations, pages 30–40, Barcelona, Spain (online). Association for Computational Linguistics.
Cite (Informal):
Cross-lingual annotation: a road map for low- and no-resource languages (Vigus et al., DMR 2020)
PDF:
https://aclanthology.org/2020.dmr-1.4.pdf