Controlled Evaluation of Syntactic Knowledge in Multilingual Language Models

Daria Kryvosheieva, Roger Levy


Abstract
Language models (LMs) are capable of acquiring elements of human-like syntactic knowledge. Targeted syntactic evaluation tests have been employed to measure how well they form generalizations about syntactic phenomena in high-resource languages such as English. However, we still lack a thorough understanding of LMs’ capacity for syntactic generalizations in low-resource languages, which are responsible for much of the diversity of syntactic patterns worldwide. In this study, we develop targeted syntactic evaluation tests for three low-resource languages (Basque, Hindi, and Swahili) and use them to evaluate five families of open-access multilingual Transformer LMs. We find that some syntactic tasks prove relatively easy for LMs while others (agreement in sentences containing indirect objects in Basque, agreement across a prepositional phrase in Swahili) are challenging. We additionally uncover issues with publicly available Transformers, including a bias toward the habitual aspect in Hindi in multilingual BERT and underperformance compared to similar-sized models in XGLM-4.5B.
Anthology ID:
2025.loreslm-1.30
Volume:
Proceedings of the First Workshop on Language Models for Low-Resource Languages
Month:
January
Year:
2025
Address:
Abu Dhabi, United Arab Emirates
Editors:
Hansi Hettiarachchi, Tharindu Ranasinghe, Paul Rayson, Ruslan Mitkov, Mohamed Gaber, Damith Premasiri, Fiona Anting Tan, Lasitha Uyangodage
Venues:
LoResLM | WS
Publisher:
Association for Computational Linguistics
Pages:
402–413
URL:
https://aclanthology.org/2025.loreslm-1.30/
Cite (ACL):
Daria Kryvosheieva and Roger Levy. 2025. Controlled Evaluation of Syntactic Knowledge in Multilingual Language Models. In Proceedings of the First Workshop on Language Models for Low-Resource Languages, pages 402–413, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Controlled Evaluation of Syntactic Knowledge in Multilingual Language Models (Kryvosheieva & Levy, LoResLM 2025)
PDF:
https://aclanthology.org/2025.loreslm-1.30.pdf
Optional supplementary material:
2025.loreslm-1.30.OptionalSupplementaryMaterial.zip