A tunable language model for statistical machine translation

Junfei Guo, Juan Liu, Qi Han, Andreas Maletti


Abstract
A novel variation of the modified KNESER-NEY model using monomial discounting is presented and integrated into the MOSES statistical machine translation toolkit. The language model is trained on a large training set as usual, but its new discount parameters are tuned on the small development set. An in-domain and cross-domain evaluation of the language model based on perplexity yields sizable improvements. Additionally, the language model is evaluated in several major machine translation tasks, including Chinese-to-English, in which the test data comes from a (slightly) different domain than the training data. The experimental results indicate that the new model significantly outperforms a baseline model using SRILM in these domain adaptation scenarios. The new language model is thus ideally suited for domain adaptation without sacrificing performance on in-domain experiments.
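For context, a minimal sketch of standard interpolated KNESER-NEY smoothing, on which the paper builds. The abstract does not state the exact form of the monomial discount, so the fixed discount D below merely stands in for the tunable parametrization. Here c(h, w) is the training count of word w after history h, N_{1+}(h ·) is the number of distinct words observed after h, and h' is the history h with its oldest word dropped:

\[ p(w \mid h) \;=\; \frac{\max\bigl(c(h,w) - D,\, 0\bigr)}{c(h)} \;+\; \frac{D \cdot N_{1+}(h\,\cdot)}{c(h)} \, p(w \mid h') \]

In the modified variant of Chen and Goodman, the single discount D is replaced by three discounts D_1, D_2, D_{3+} selected by the n-gram count; the tunable model described in this paper presumably optimizes such discount parameters directly for development-set perplexity rather than estimating them from training counts.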
Anthology ID:
2014.amta-researchers.27
Volume:
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track
Month:
October 22-26
Year:
2014
Address:
Vancouver, Canada
Editors:
Yaser Al-Onaizan, Michel Simard
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
Pages:
356–368
URL:
https://aclanthology.org/2014.amta-researchers.27
Cite (ACL):
Junfei Guo, Juan Liu, Qi Han, and Andreas Maletti. 2014. A tunable language model for statistical machine translation. In Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track, pages 356–368, Vancouver, Canada. Association for Machine Translation in the Americas.
Cite (Informal):
A tunable language model for statistical machine translation (Guo et al., AMTA 2014)
PDF:
https://aclanthology.org/2014.amta-researchers.27.pdf