Minimally-Supervised Morphological Segmentation using Adaptor Grammars

Kairit Sirts, Sharon Goldwater


Abstract
This paper explores the use of Adaptor Grammars, a nonparametric Bayesian modelling framework, for minimally supervised morphological segmentation. We compare three training methods: unsupervised training, semi-supervised training, and a novel model selection method. In the model selection method, we train unsupervised Adaptor Grammars using an over-articulated metagrammar, then use a small labelled data set to select which potential morph boundaries identified by the metagrammar should be returned in the final output. We evaluate on five languages and show that semi-supervised training provides a boost over unsupervised training, while the model selection method yields the best average results over all languages and is competitive with state-of-the-art semi-supervised systems. Moreover, this method provides the potential to tune performance according to different evaluation metrics or downstream tasks.
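The model selection step described in the abstract can be illustrated with a minimal sketch. It assumes the unsupervised grammar has already produced, for each word, one candidate segmentation per nonterminal level, and that selection means picking the level whose boundaries score the highest F1 on the small labelled set. The level names (`Morph`, `SubMorph`), the toy words, and the choice of boundary F1 as the selection criterion are illustrative assumptions, not details taken from the paper.

```python
from typing import Dict, List, Set, Tuple

Segmentation = List[str]


def boundaries(segments: Segmentation) -> Set[int]:
    """Return the internal boundary offsets of a segmented word."""
    cuts: Set[int] = set()
    pos = 0
    for morph in segments[:-1]:  # the end of the word is not a boundary
        pos += len(morph)
        cuts.add(pos)
    return cuts


def boundary_f1(pred: Dict[str, Segmentation],
                gold: Dict[str, Segmentation]) -> float:
    """Micro-averaged boundary F1 of predictions over the labelled words."""
    tp = fp = fn = 0
    for word, gold_seg in gold.items():
        p, g = boundaries(pred[word]), boundaries(gold_seg)
        tp += len(p & g)
        fp += len(p - g)
        fn += len(g - p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_level(candidates: Dict[str, Dict[str, Segmentation]],
                 gold: Dict[str, Segmentation]) -> Tuple[str, float]:
    """Pick the grammar level whose boundaries best match the labelled set."""
    scored = {level: boundary_f1(segs, gold)
              for level, segs in candidates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


# Made-up toy data: two hypothetical levels of an over-articulated grammar.
candidates = {
    "Morph": {"walked": ["walk", "ed"], "unhelpful": ["unhelp", "ful"]},
    "SubMorph": {"walked": ["wal", "k", "ed"], "unhelpful": ["un", "help", "ful"]},
}
gold = {"walked": ["walk", "ed"], "unhelpful": ["un", "help", "ful"]}

best_level, best_score = select_level(candidates, gold)
```

Swapping `boundary_f1` for another scoring function is one way to picture the abstract's point that the method can be tuned to different evaluation metrics or downstream tasks.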
Anthology ID:
Q13-1021
Volume:
Transactions of the Association for Computational Linguistics, Volume 1
Year:
2013
Address:
Cambridge, MA
Editors:
Dekang Lin, Michael Collins
Venue:
TACL
Publisher:
MIT Press
Pages:
255–266
URL:
https://aclanthology.org/Q13-1021
DOI:
10.1162/tacl_a_00225
Cite (ACL):
Kairit Sirts and Sharon Goldwater. 2013. Minimally-Supervised Morphological Segmentation using Adaptor Grammars. Transactions of the Association for Computational Linguistics, 1:255–266.
Cite (Informal):
Minimally-Supervised Morphological Segmentation using Adaptor Grammars (Sirts & Goldwater, TACL 2013)
PDF:
https://aclanthology.org/Q13-1021.pdf