2024
A Generalized Algorithm for Learning Positive and Negative Grammars with Unconventional String Models
Sarah Payne
Proceedings of the Society for Computation in Linguistics 2024
2023
A Cautious Generalization Goes a Long Way: Learning Morphophonological Rules
Salam Khalifa | Sarah Payne | Jordan Kodner | Ellen Broselow | Owen Rambow
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Explicit linguistic knowledge, encoded by resources such as rule-based morphological analyzers, continues to prove useful in downstream NLP tasks, especially for low-resource languages and dialects. Rules are an important asset in descriptive linguistic grammars. However, creating such resources is usually expensive and non-trivial, especially for spoken varieties with no written standard. In this work, we present a novel approach for automatically learning morphophonological rules of Arabic from a corpus. Motivated by classic cognitive models for rule learning, rules are generalized cautiously. Rules that are memorized for individual items are only allowed to generalize to unseen forms if they are sufficiently reliable in the training data. The learned rules are further examined to ensure that they capture true linguistic phenomena described by domain experts. We also investigate the learnability of rules in low-resource settings across different experimental setups and dialects.
Morphological Inflection: A Reality Check
Jordan Kodner | Sarah Payne | Salam Khalifa | Zoey Liu
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Morphological inflection is a popular task in sub-word NLP with both practical and cognitive applications. For years now, state-of-the-art systems have reported high, but also highly variable, performance across data sets and languages. We investigate the causes of this high performance and high variability; we find several aspects of data set creation and evaluation which systematically inflate performance and obfuscate differences between languages. To improve generalizability and reliability of results, we propose new data sampling and evaluation strategies that better reflect likely use-cases. Using these new strategies, we make new observations on the generalization abilities of current inflection systems.
2021
Learning Morphological Productivity as Meaning-Form Mappings
Sarah Payne | Jordan Kodner | Charles Yang
Proceedings of the Society for Computation in Linguistics 2021