Language-Agnostic Measures Discriminate Inflection and Derivation

Coleman Haley, Edoardo M. Ponti, Sharon Goldwater


Abstract
In morphology, a distinction is commonly drawn between inflection and derivation. However, a precise definition of this distinction that captures how the terms are used across languages remains elusive within linguistic theory, and is typically based on subjective tests. In this study, we present 4 quantitative measures that use the statistics of a raw text corpus in a language to estimate how much and how variably a morphological construction changes aspects of the lexical entry: specifically, the word’s form and the word’s semantic and syntactic properties (as operationalised by distributional word embeddings). Based on a sample of 26 languages, we find that we can reconstruct 90% of the classification of constructions into inflection and derivation in UniMorph using our 4 measures, providing large-scale cross-linguistic evidence that the concepts of inflection and derivation are associated with measurable signatures in form and distribution that behave consistently across a variety of languages. Critically, our measures and models are entirely language-agnostic, yet perform well across all languages studied. We find that while the terms inflection and derivation are used with a high degree of consistency with respect to our measures, there are still many constructions near the model’s decision boundary between the two categories, indicating a gradient, rather than categorical, distinction.
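The abstract does not spell out how the 4 measures are defined, so the following is only an illustrative sketch of what "how much and how variably" a construction changes form and embedding could look like. Everything in it is an assumption for illustration: the function names, the use of Levenshtein distance for form change, Euclidean distance for embedding change, and the two spread statistics are not taken from the paper.

```python
import math
from statistics import mean, pstdev


def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def norm(vec):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def construction_measures(pairs, embeddings):
    """Hypothetical measures for one morphological construction.

    pairs: list of (base_form, constructed_form) strings.
    embeddings: dict mapping each word form to a list of floats.
    """
    # Form: how many edits separate each base from its constructed form.
    form_changes = [levenshtein(base, out) for base, out in pairs]

    # Distribution: difference vector between the two embeddings.
    diffs = [[y - x for x, y in zip(embeddings[base], embeddings[out])]
             for base, out in pairs]
    mean_diff = [mean(col) for col in zip(*diffs)]

    return {
        "form_magnitude": mean(form_changes),
        # Assumed spread statistic: population standard deviation.
        "form_variability": pstdev(form_changes),
        "embedding_magnitude": mean(norm(d) for d in diffs),
        # Assumed spread statistic: mean distance to the mean difference.
        "embedding_variability": mean(
            norm([a - b for a, b in zip(d, mean_diff)]) for d in diffs
        ),
    }
```

Under these assumptions, a construction whose word pairs all change in the same way (same number of edits, same embedding shift) gets variability scores of zero, while one with idiosyncratic changes scores higher. The four resulting numbers could then serve as features for a classifier separating inflection from derivation.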
Anthology ID:
2023.sigtyp-1.18
Volume:
Proceedings of the 5th Workshop on Research in Computational Linguistic Typology and Multilingual NLP
Month:
May
Year:
2023
Address:
Dubrovnik, Croatia
Editors:
Lisa Beinborn, Koustava Goswami, Saliha Muradoğlu, Alexey Sorokin, Ritesh Kumar, Andreas Shcherbakov, Edoardo M. Ponti, Ryan Cotterell, Ekaterina Vylomova
Venue:
SIGTYP
Publisher:
Association for Computational Linguistics
Pages:
150–152
URL:
https://aclanthology.org/2023.sigtyp-1.18
DOI:
10.18653/v1/2023.sigtyp-1.18
Cite (ACL):
Coleman Haley, Edoardo M. Ponti, and Sharon Goldwater. 2023. Language-Agnostic Measures Discriminate Inflection and Derivation. In Proceedings of the 5th Workshop on Research in Computational Linguistic Typology and Multilingual NLP, pages 150–152, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal):
Language-Agnostic Measures Discriminate Inflection and Derivation (Haley et al., SIGTYP 2023)
PDF:
https://aclanthology.org/2023.sigtyp-1.18.pdf
Video:
https://aclanthology.org/2023.sigtyp-1.18.mp4