Nico Urbach
2026
Onomasiological Sense Alignment Across Dialect Dictionaries. A Taxonomy-Constrained LLM Classification
Nathalie Mederake | Nico Urbach | Hanna Fischer | Alfred Lameli
Proceedings of the 13th Workshop on NLP for Similar Languages, Varieties and Dialects
We propose a taxonomy-guided approach to semantic alignment that assigns lexicographic senses to an onomasiological taxonomy derived from the Hallig–Wartburg/Post system. Using an LLM under strict taxonomic constraints, short and heterogeneous meaning descriptions are assigned to a common conceptual space. Evaluation against expert annotation shows that run-to-run model agreement (kappa = 0.73) closely matches human agreement (kappa = 0.74), with robustness at coarse taxonomic levels and predictable degradation at finer granularity. A qualitative network analysis demonstrates the resulting potential for cross-dictionary exploration of dialectal variation in semantics.
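The agreement figures above can be made concrete with a small sketch. The abstract does not say which kappa variant was used, so Cohen's kappa for two label sequences is assumed here. The dotted label codes (such as "A.1.2") and the helper names are hypothetical illustrations of a hierarchical taxonomy, not the paper's actual notation or data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of category labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items that received the same label twice.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both sequences use one and the same label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def truncate(label, depth):
    """Cut a dotted taxonomy code down to its first `depth` levels."""
    return ".".join(label.split(".")[:depth])


# Toy data (invented for illustration): two classification runs over six senses.
run_1 = ["A.1.1", "A.1.2", "B.2.1", "B.2.1", "C.1.1", "A.1.1"]
run_2 = ["A.1.1", "A.1.1", "B.2.1", "B.2.2", "C.1.1", "A.1.1"]

# Agreement at each taxonomic depth, from coarse (1) to fine (3).
kappa_by_depth = {
    depth: cohen_kappa(
        [truncate(label, depth) for label in run_1],
        [truncate(label, depth) for label in run_2],
    )
    for depth in (1, 2, 3)
}

for depth, kappa in kappa_by_depth.items():
    print(f"depth {depth}: kappa = {kappa:.2f}")
```

On this toy data the two runs agree perfectly at depths 1 and 2 and diverge only at depth 3. This mirrors, in miniature, the pattern the abstract reports: robust agreement at coarse levels and lower agreement at finer granularity.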