LISN @ SIGMORPHON 2023 Shared Task on Interlinear Glossing

Shu Okabe, François Yvon


Abstract
This paper describes LISN's submission to the second track (open track) of the shared task on Interlinear Glossing for SIGMORPHON 2023. Our systems are based on Lost, a variation of linear Conditional Random Fields initially developed as a probabilistic translation model and then adapted to the glossing task. This model allows us to handle one of the main challenges posed by glossing, i.e. the fact that the list of potential labels for lexical morphemes is not fixed in advance and needs to be extended dynamically when labelling units are not seen in training. In such situations, we show how to make use of candidate lexical glosses found in the translation and discuss how such extension affects the training and inference procedures. The resulting automatic glossing systems prove to yield very competitive results, especially in low-resource settings.
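The idea of extending the label set with glosses taken from the translation can be illustrated with a small sketch. This is not the Lost implementation: the function name `candidate_glosses`, the dictionary-based training lexicon, and the whitespace tokenisation of the translation are all assumptions made for illustration only. The sketch simply builds, for one source morpheme, a candidate set made of a closed inventory of grammatical labels, the lexical glosses observed in training, and the words of the sentence's translation.

```python
import string


def candidate_glosses(morpheme, train_lexicon, translation, grammatical_labels):
    """Return the set of candidate gloss labels for one source morpheme.

    Hypothetical helper, not part of any published system.
    - train_lexicon: dict mapping a morpheme to the lexical glosses seen in training
    - translation: the free translation of the sentence, as a plain string
    - grammatical_labels: the closed inventory of grammatical glosses
    """
    # Lexical glosses already observed for this morpheme (empty if unseen).
    seen = set(train_lexicon.get(morpheme, ()))

    # Assumption: each translation word, lowercased and stripped of
    # surrounding punctuation, is a possible lexical gloss.
    from_translation = set()
    for token in translation.split():
        word = token.strip(string.punctuation).lower()
        if word:
            from_translation.add(word)

    # The label set is the union of the closed and the dynamic parts.
    return set(grammatical_labels) | seen | from_translation


if __name__ == "__main__":
    lexicon = {"chien": {"dog"}}
    labels = candidate_glosses("chat", lexicon, "The cat sleeps.", ["PL", "3SG"])
    print(sorted(labels))
```

Under these assumptions, a morpheme never seen in training can still receive a lexical gloss as long as that gloss appears in the translation; a sequence model would then have to score these sentence-specific candidates alongside the fixed grammatical labels.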
Anthology ID:
2023.sigmorphon-1.21
Volume:
Proceedings of the 20th SIGMORPHON workshop on Computational Research in Phonetics, Phonology, and Morphology
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Garrett Nicolai, Eleanor Chodroff, Frederic Mailhot, Çağrı Çöltekin
Venue:
SIGMORPHON
SIG:
SIGMORPHON
Publisher:
Association for Computational Linguistics
Pages:
202–208
URL:
https://aclanthology.org/2023.sigmorphon-1.21
DOI:
10.18653/v1/2023.sigmorphon-1.21
Cite (ACL):
Shu Okabe and François Yvon. 2023. LISN @ SIGMORPHON 2023 Shared Task on Interlinear Glossing. In Proceedings of the 20th SIGMORPHON workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 202–208, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
LISN @ SIGMORPHON 2023 Shared Task on Interlinear Glossing (Okabe & Yvon, SIGMORPHON 2023)
PDF:
https://aclanthology.org/2023.sigmorphon-1.21.pdf