Compound or Term Features? Analyzing Salience in Predicting the Difficulty of German Noun Compounds across Domains

Anna Hätty, Julia Bettinger, Michael Dorna, Jonas Kuhn, Sabine Schulte im Walde


Abstract
Predicting the difficulty of domain-specific vocabulary is an important task towards a better understanding of a domain and towards enhancing communication between lay people and experts. We investigate German closed noun compounds and focus on the interaction of compound-based lexical features (such as frequency and productivity) and terminology-based features (contrasting domain-specific and general language) across word representations and classifiers. Our prediction experiments complement insights from classification using (a) manually designed features to characterise termhood and compound formation and (b) compound and constituent word embeddings. We find that for a broad binary distinction into ‘easy’ vs. ‘difficult’, general-language compound frequency is sufficient, but for a more fine-grained four-class distinction it is crucial to include contrastive termhood features as well as compound and constituent features.
Anthology ID:
2021.starsem-1.24
Volume:
Proceedings of *SEM 2021: The Tenth Joint Conference on Lexical and Computational Semantics
Month:
August
Year:
2021
Address:
Online
Editors:
Lun-Wei Ku, Vivi Nastase, Ivan Vulić
Venue:
*SEM
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
252–262
URL:
https://aclanthology.org/2021.starsem-1.24
DOI:
10.18653/v1/2021.starsem-1.24
Cite (ACL):
Anna Hätty, Julia Bettinger, Michael Dorna, Jonas Kuhn, and Sabine Schulte im Walde. 2021. Compound or Term Features? Analyzing Salience in Predicting the Difficulty of German Noun Compounds across Domains. In Proceedings of *SEM 2021: The Tenth Joint Conference on Lexical and Computational Semantics, pages 252–262, Online. Association for Computational Linguistics.
Cite (Informal):
Compound or Term Features? Analyzing Salience in Predicting the Difficulty of German Noun Compounds across Domains (Hätty et al., *SEM 2021)
PDF:
https://aclanthology.org/2021.starsem-1.24.pdf