Targeted Syntactic Evaluation of Language Models on Georgian Case Alignment

Daniel Gallagher; Gerhard Heyer

Abstract

This paper evaluates the performance of transformer-based language models on split-ergative case alignment in Georgian, a particularly rare system for assigning grammatical cases to mark argument roles. We focus on subject and object marking determined through various permutations of nominative, ergative, and dative noun forms. We implement a treebank-based approach to generating minimal pairs using the Grew query language. The resulting dataset comprises 370 syntactic tests across seven tasks of 50–70 samples each, with three noun forms tested in any given sample. Five encoder-only and two decoder-only models are evaluated with word- and/or sentence-level accuracy metrics. Regardless of the specific syntactic makeup, models performed worst at assigning the ergative case and best at assigning the nominative case. Performance correlated with the overall frequency distribution of the three forms (NOM > DAT > ERG). Though data scarcity is a known issue for low-resource languages, we show that the highly specific role of the ergative, together with a lack of available training data, likely contributes to poor performance on this case. The dataset is made publicly available, and the methodology offers a route to future syntactic evaluations of languages for which benchmarks are limited.
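To make the sentence-level metric concrete, here is a minimal sketch of how accuracy on minimal pairs is commonly computed in targeted syntactic evaluation: a pair counts as correct when the model scores the grammatical sentence above its ungrammatical counterpart. This is an illustration under assumptions, not the authors' implementation. The names `MinimalPair` and `accuracy_by_case` are hypothetical, the `score` callable stands in for any model-derived sentence score (for example a log-probability), and the sentences and scores are placeholders rather than items from the dataset.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, NamedTuple


class MinimalPair(NamedTuple):
    """One test item (hypothetical structure, not the paper's data format)."""
    good: str  # sentence with the correct case form
    bad: str   # same sentence with one noun in the wrong case
    case: str  # case under test: "NOM", "DAT" or "ERG"


def accuracy_by_case(
    pairs: Iterable[MinimalPair],
    score: Callable[[str], float],
) -> Dict[str, float]:
    """Fraction of pairs, per case, where the good sentence scores higher."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for pair in pairs:
        totals[pair.case] += 1
        # A pair is correct only if the grammatical variant is strictly preferred.
        if score(pair.good) > score(pair.bad):
            hits[pair.case] += 1
    return {case: hits[case] / totals[case] for case in totals}


# Placeholder sentences and made-up scores, used only to exercise the function.
toy_pairs = [
    MinimalPair("good-1", "bad-1", "NOM"),
    MinimalPair("good-2", "bad-2", "ERG"),
    MinimalPair("good-3", "bad-3", "ERG"),
]
toy_scores = {
    "good-1": -1.0, "bad-1": -2.0,
    "good-2": -3.0, "bad-2": -2.5,
    "good-3": -1.0, "bad-3": -4.0,
}
result = accuracy_by_case(toy_pairs, toy_scores.__getitem__)
print(result)
```

Grouping by case mirrors the per-case comparison reported in the abstract; in practice the `score` argument would wrap a real model rather than a lookup table.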

Anthology ID: 2026.loreslm-1.23
Volume: Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Hansi Hettiarachchi, Tharindu Ranasinghe, Alistair Plum, Paul Rayson, Ruslan Mitkov, Mohamed Gaber, Damith Premasiri, Fiona Anting Tan, Lasitha Uyangodage
Venue: LoResLM
Publisher: Association for Computational Linguistics
Pages: 259–270
URL: https://aclanthology.org/2026.loreslm-1.23/
Cite (ACL): Daniel Gallagher and Gerhard Heyer. 2026. Targeted Syntactic Evaluation of Language Models on Georgian Case Alignment. In Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026), pages 259–270, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Targeted Syntactic Evaluation of Language Models on Georgian Case Alignment (Gallagher & Heyer, LoResLM 2026)
PDF: https://aclanthology.org/2026.loreslm-1.23.pdf
