An Unsupervised Method for Uncovering Morphological Chains

Karthik Narasimhan, Regina Barzilay, Tommi Jaakkola


Abstract
Most state-of-the-art systems today produce morphological analysis based only on orthographic patterns. In contrast, we propose a model for unsupervised morphological analysis that integrates orthographic and semantic views of words. We model word formation in terms of morphological chains, from base words to the observed words, breaking the chains into parent-child relations. We use log-linear models with morpheme and word-level features to predict possible parents, including their modifications, for each word. The limited set of candidate parents for each word renders contrastive estimation feasible. Our model consistently matches or outperforms five state-of-the-art systems on Arabic, English and Turkish.
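To make the parent-prediction idea in the abstract concrete, the following is a minimal sketch of a log-linear distribution over candidate parents of a word. It is not the authors' implementation: the candidate generator, the feature names, the `weights` values, and the tiny `vocab` are all hypothetical and chosen only for illustration. The sketch also omits the semantic features and the contrastive estimation step that the abstract mentions, and it considers only plain prefix and suffix stripping without modifications to the parent.

```python
import math
from typing import Dict, List, Set, Tuple

# A candidate is (parent, kind), where kind is "suffix", "prefix", or "stop".
Candidate = Tuple[str, str]


def candidates(word: str, min_len: int = 2) -> List[Candidate]:
    """Enumerate toy candidate parents by splitting the word at each position."""
    cands: List[Candidate] = [(word, "stop")]  # "stop": the word is a base form
    for i in range(min_len, len(word)):
        cands.append((word[:i], "suffix"))  # parent followed by suffix word[i:]
    for i in range(1, len(word) - min_len + 1):
        cands.append((word[i:], "prefix"))  # prefix word[:i] followed by parent
    return cands


def features(word: str, cand: Candidate, vocab: Set[str]) -> Dict[str, float]:
    """Hypothetical morpheme- and word-level indicator features."""
    parent, kind = cand
    feats = {f"type={kind}": 1.0}
    if kind == "suffix":
        feats[f"affix=-{word[len(parent):]}"] = 1.0
    elif kind == "prefix":
        feats[f"affix={word[:len(word) - len(parent)]}-"] = 1.0
    if kind != "stop" and parent in vocab:
        feats["parent_in_vocab"] = 1.0
    return feats


def parent_distribution(
    word: str, weights: Dict[str, float], vocab: Set[str]
) -> Dict[Candidate, float]:
    """Log-linear model: P(candidate | word) is proportional to exp(w . f)."""
    cands = candidates(word)
    scores = [
        sum(weights.get(name, 0.0) * value
            for name, value in features(word, c, vocab).items())
        for c in cands
    ]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return {c: e / z for c, e in zip(cands, exps)}


# Assumed toy weights and vocabulary, for illustration only.
weights = {"affix=-s": 2.0, "parent_in_vocab": 1.5}
vocab = {"paint", "painting"}
dist = parent_distribution("paintings", weights, vocab)
best = max(dist, key=dist.get)
```

With these assumed weights, the highest-probability candidate for "paintings" is the parent "painting" with suffix "-s". Applying the same prediction to "painting" in turn would extend the chain toward a base word. Because the candidate set for each word is small, the normalizer `z` is cheap to compute exactly.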
Anthology ID:
Q15-1012
Volume:
Transactions of the Association for Computational Linguistics, Volume 3
Year:
2015
Address:
Cambridge, MA
Venue:
TACL
Publisher:
MIT Press
Pages:
157–167
URL:
https://aclanthology.org/Q15-1012
DOI:
10.1162/tacl_a_00130
Cite (ACL):
Karthik Narasimhan, Regina Barzilay, and Tommi Jaakkola. 2015. An Unsupervised Method for Uncovering Morphological Chains. Transactions of the Association for Computational Linguistics, 3:157–167.
Cite (Informal):
An Unsupervised Method for Uncovering Morphological Chains (Narasimhan et al., TACL 2015)
PDF:
https://aclanthology.org/Q15-1012.pdf
Code:
karthikncode/MorphoChain