UD for German Poetry

Stefanie Dipper, Ronja Laarmann-Quante


Abstract
This article deals with the syntactic analysis of German-language poetry from different centuries. We use Universal Dependencies (UD) as our syntactic framework. We discuss particular challenges of the poems in terms of tokenization, sentence boundary recognition and special syntactic constructions. Our annotated corpus currently consists of 20 poems with a total of 2,162 tokens, which originate from the PoeTree.de corpus. We present some statistics on our annotations and also evaluate the automatic UD annotation from PoeTree.de using our annotations.
Anthology ID:
2024.nlp4dh-1.17
Volume:
Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities
Month:
November
Year:
2024
Address:
Miami, USA
Editors:
Mika Hämäläinen, Emily Öhman, So Miyagawa, Khalid Alnajjar, Yuri Bizzoni
Venue:
NLP4DH
Publisher:
Association for Computational Linguistics
Pages:
177–188
URL:
https://aclanthology.org/2024.nlp4dh-1.17
Cite (ACL):
Stefanie Dipper and Ronja Laarmann-Quante. 2024. UD for German Poetry. In Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities, pages 177–188, Miami, USA. Association for Computational Linguistics.
Cite (Informal):
UD for German Poetry (Dipper & Laarmann-Quante, NLP4DH 2024)
PDF:
https://aclanthology.org/2024.nlp4dh-1.17.pdf