Reut Tsarfaty


2025

Findings of the UniDive 2025 shared task on multilingual Morpho-Syntactic Parsing
Omer Goldman | Leonie Weissweiler | Kutay Acar | Diego Alves | Anna Baczkowska | Gulsen Eryigit | Lenka Krippnerová | Adriana Pagano | Tanja Samardžić | Luigi Talamo | Alina Wróblewska | Daniel Zeman | Joakim Nivre | Reut Tsarfaty
Proceedings of The UniDive 2025 Shared Task on Multilingual Morpho-Syntactic Parsing

This paper details the findings of the 2025 UniDive shared task on multilingual morphosyntactic parsing. It introduces a new representation in which morphology and syntax are modelled jointly to form dependency trees of contentful elements, each characterized by features determined by grammatical words and morphemes. This schema bypasses the theoretical debate over the definition of “words” and encourages the development of parsers for typologically diverse languages. The data for the task, spanning 9 languages, was annotated on the basis of existing Universal Dependencies (UD) treebanks, which were adapted to the new format. We accompany the data with a new metric, MSLAS, that combines syntactic LAS with F1 over grammatical features. The task received two submissions, which together with three baselines give a detailed view of the ability of multi-task encoder models to cope with the task. The best-performing system, UM, achieved 78.7 MSLAS macro-averaged over all languages, improving by 31.4 points over the few-shot prompting baseline.