Evaluating Cross Lingual Transfer for Morphological Analysis: a Case Study of Indian Languages

Siddhesh Pawar, Pushpak Bhattacharyya, Partha Talukdar


Abstract
Recent advances in pretrained multilingual models such as Multilingual T5 (mT5) have facilitated cross-lingual transfer by learning shared representations across languages. Leveraging pretrained multilingual models to scale morphology analyzers to low-resource languages is a unique opportunity that has been under-explored so far. We investigate this line of research in the context of Indian languages, focusing on two important morphological sub-tasks: root word extraction and tagging of morphosyntactic descriptions (MSD), viz., gender, number, and person (GNP). We experiment with six Indian languages from two language families (Dravidian and Indo-Aryan) to train multilingual morphology analyzers, for the first time for Indian languages. In controlled experiments, we demonstrate the usability of multilingual models for few-shot cross-lingual transfer: GNP tagging improves by an average of 7% in a cross-lingual setting compared to a monolingual setting. We provide an overview of the state of the datasets available for our tasks and point out a few modeling limitations that stem from these datasets. Lastly, we analyze the cross-lingual transfer of morphological tags for verbs and nouns, which provides a proxy for the quality of the representations of word markings learned by the model.
Anthology ID:
2023.sigmorphon-1.3
Volume:
Proceedings of the 20th SIGMORPHON workshop on Computational Research in Phonetics, Phonology, and Morphology
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Garrett Nicolai, Eleanor Chodroff, Frederic Mailhot, Çağrı Çöltekin
Venue:
SIGMORPHON
SIG:
SIGMORPHON
Publisher:
Association for Computational Linguistics
Pages:
14–26
URL:
https://aclanthology.org/2023.sigmorphon-1.3
DOI:
10.18653/v1/2023.sigmorphon-1.3
Cite (ACL):
Siddhesh Pawar, Pushpak Bhattacharyya, and Partha Talukdar. 2023. Evaluating Cross Lingual Transfer for Morphological Analysis: a Case Study of Indian Languages. In Proceedings of the 20th SIGMORPHON workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 14–26, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Evaluating Cross Lingual Transfer for Morphological Analysis: a Case Study of Indian Languages (Pawar et al., SIGMORPHON 2023)
PDF:
https://aclanthology.org/2023.sigmorphon-1.3.pdf