NoCs: A Non-Compound-Stable Splitter for German Compounds

Carmen Schacht


Abstract
Compounding, the creation of highly complex lexical items through the combination of existing lexemes, can be considered one of the most efficient communication phenomena. However, the automatic processing of compound structures, especially multi-constituent compounds, poses significant challenges for natural language processing. Existing tools like compound-split (Tuggener, 2016) perform well on compound head detection but are limited in handling long compounds and in distinguishing compounds from non-compounds. This paper introduces NoCs (non-compound-stable splitter), a novel Python-based tool that extends the functionality of compound-split by incorporating recursive splitting, non-compound detection, and integration with state-of-the-art linguistic resources. NoCs employs a custom stack-and-buffer mechanism to traverse and decompose compounds robustly, even in cases involving multiple constituents. A large-scale evaluation using adapted GermaNet data shows that NoCs substantially outperforms compound-split in both non-compound identification and the recursive splitting of three- to five-constituent compounds, demonstrating its utility as a reliable resource for compound analysis in German.
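The abstract does not spell out how the stack-and-buffer mechanism works, so the following is only a minimal sketch of the general idea, not the NoCs implementation. It assumes a binary splitter is applied repeatedly: pending word parts wait on a stack, and parts that cannot be split further are collected in a buffer. All names (`TOY_LEXICON`, `can_segment`, `binary_split`, `split_recursively`) are hypothetical, the lexicon lookup is a stand-in for compound-split and the linguistic resources mentioned above, and linking elements are ignored.

```python
from typing import List, Optional, Tuple

# Hypothetical toy lexicon; a real system would consult external resources.
TOY_LEXICON = {"donau", "dampf", "schiff", "fahrt", "tisch"}


def can_segment(word: str) -> bool:
    """True if `word` is a lexicon entry or a concatenation of lexicon entries."""
    if word in TOY_LEXICON:
        return True
    return any(
        word[:i] in TOY_LEXICON and can_segment(word[i:])
        for i in range(1, len(word))
    )


def binary_split(word: str) -> Optional[Tuple[str, str]]:
    """Return (modifier, head) for one binary split, or None if none is found.

    The head must be a lexicon entry and the modifier must itself be
    analyzable. This is an assumed stand-in for a head-detecting splitter.
    """
    for i in range(1, len(word)):
        modifier, head = word[:i], word[i:]
        if head in TOY_LEXICON and can_segment(modifier):
            return modifier, head
    return None


def split_recursively(word: str) -> List[str]:
    """Decompose `word` into constituents using a stack and a buffer."""
    stack = [word.lower()]    # parts still waiting to be analyzed
    buffer: List[str] = []    # constituents that cannot be split further
    while stack:
        current = stack.pop()
        parts = binary_split(current)
        if parts is None:
            buffer.append(current)
        else:
            modifier, head = parts
            # Push the head first so the modifier is processed next,
            # which keeps the constituents in left-to-right order.
            stack.append(head)
            stack.append(modifier)
    return buffer


print(split_recursively("Donaudampfschifffahrt"))
print(split_recursively("Tisch"))
```

In this sketch, a word that comes back as a single item is treated as a non-compound, which is one simple way to keep the splitter stable on inputs that should not be decomposed.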
Anthology ID:
2025.ranlp-stud.6
Volume:
Proceedings of the 9th Student Research Workshop associated with the International Conference Recent Advances in Natural Language Processing
Month:
September
Year:
2025
Address:
Varna, Bulgaria
Editors:
Boris Velichkov, Ivelina Nikolova-Koleva, Milena Slavcheva
Venues:
RANLP | WS
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
44–53
URL:
https://aclanthology.org/2025.ranlp-stud.6/
Cite (ACL):
Carmen Schacht. 2025. NoCs: A Non-Compound-Stable Splitter for German Compounds. In Proceedings of the 9th Student Research Workshop associated with the International Conference Recent Advances in Natural Language Processing, pages 44–53, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
NoCs: A Non-Compound-Stable Splitter for German Compounds (Schacht, RANLP 2025)
PDF:
https://aclanthology.org/2025.ranlp-stud.6.pdf