Talha Bedir
2021
Overcoming the challenges in morphological annotation of Turkish in universal dependencies framework
Talha Bedir | Karahan Şahin | Onur Gungor | Suzan Uskudarli | Arzucan Özgür | Tunga Güngör | Balkiz Ozturk Basaran
Proceedings of the Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop
This paper presents several challenges faced when annotating Turkish treebanks in accordance with the Universal Dependencies (UD) guidelines and proposes solutions to address them. Most of these challenges stem from the UD framework's lack of adequate support for accurately representing null morphemes and complex derivations, which results in a significant loss of information for Turkish. This loss negatively impacts the tools that are developed from these treebanks. We raised and discussed these issues within the community on the official UD portal. This paper presents these issues and our proposals for representing morphosyntactic information for Turkish more accurately while adhering to the UD guidelines. This work aims to contribute to the representation of Turkish and other agglutinative languages in UD-based treebanks, which in turn aids the development of more accurately annotated datasets for such languages.