Building a Universal Dependencies Treebank for Georgian

Irina Lobzhanidze, Erekle Magradze, Svetlana Berikashvili, Anzor Gozalishvili, Tamar Jalaghonia


Abstract
This paper presents the design and development of the Georgian Syntactic Treebank within the Universal Dependencies (UD) framework, addressing the unique morphosyntactic challenges of Georgian, a Kartvelian language. We describe the methodology for selecting and annotating 3,013 sentences from Wiki, mapping existing tagsets to the UD scheme, and converting data into the CoNLL-U format. The paper also details the training of a UDPipe model using this preliminary treebank.
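As a rough illustration of the tagset-mapping and CoNLL-U conversion steps mentioned in the abstract, the following minimal Python sketch serializes one annotated sentence into the ten tab-separated CoNLL-U columns. The `TAG_TO_UPOS` table, the `Token` class, the `to_conllu` function, and the English toy sentence are all hypothetical placeholders. They are not the authors' tagset, code, or data.

```python
from dataclasses import dataclass

# Hypothetical source-tag -> UPOS mapping (an assumption for illustration;
# the treebank's actual tagset mapping is not given in the abstract).
TAG_TO_UPOS = {"N": "NOUN", "V": "VERB", "Punct": "PUNCT"}


@dataclass
class Token:
    form: str     # surface form
    lemma: str    # lemma
    src_tag: str  # tag from the (hypothetical) source tagset
    head: int     # 1-based index of the head token, 0 for the root
    deprel: str   # UD dependency relation


def to_conllu(sent_id, tokens):
    """Serialize one sentence as a CoNLL-U block (10 tab-separated columns)."""
    lines = [
        f"# sent_id = {sent_id}",
        # Joining forms with spaces only approximates the original text.
        "# text = " + " ".join(t.form for t in tokens),
    ]
    for i, t in enumerate(tokens, start=1):
        upos = TAG_TO_UPOS.get(t.src_tag, "X")  # unknown tags fall back to X
        # Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
        cols = [str(i), t.form, t.lemma, upos, t.src_tag,
                "_", str(t.head), t.deprel, "_", "_"]
        lines.append("\t".join(cols))
    # A blank line terminates each sentence in CoNLL-U.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    toy = [
        Token("Dogs", "dog", "N", 2, "nsubj"),
        Token("bark", "bark", "V", 0, "root"),
        Token(".", ".", "Punct", 2, "punct"),
    ]
    print(to_conllu("toy-1", toy), end="")
```

Unspecified fields (FEATS, DEPS, MISC) are written as underscores, which is how CoNLL-U marks empty values. A real conversion would also fill in the morphological features.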
Anthology ID:
2024.tlt-1.5
Volume:
Proceedings of the 22nd Workshop on Treebanks and Linguistic Theories (TLT 2024)
Month:
December
Year:
2024
Address:
Hamburg, Germany
Editors:
Daniel Dakota, Sarah Jablotschkin, Sandra Kübler, Heike Zinsmeister
Venues:
TLT | WS
Publisher:
Association for Computational Linguistics
Pages:
40–45
URL:
https://aclanthology.org/2024.tlt-1.5/
Cite (ACL):
Irina Lobzhanidze, Erekle Magradze, Svetlana Berikashvili, Anzor Gozalishvili, and Tamar Jalaghonia. 2024. Building a Universal Dependencies Treebank for Georgian. In Proceedings of the 22nd Workshop on Treebanks and Linguistic Theories (TLT 2024), pages 40–45, Hamburg, Germany. Association for Computational Linguistics.
Cite (Informal):
Building a Universal Dependencies Treebank for Georgian (Lobzhanidze et al., TLT 2024)
PDF:
https://aclanthology.org/2024.tlt-1.5.pdf