Integrating Encyclopedic Knowledge into Neural Language Models

Yang Zhang, Jan Niehues, Alexander Waibel


Abstract
Neural models have recently yielded substantial improvements in the performance of phrase-based machine translation. Recurrent language models, in particular, have been very successful due to their ability to model arbitrarily long contexts. In this work, we integrate global semantic information extracted from large encyclopedic sources into neural network language models: semantic word classes extracted from Wikipedia and sentence-level topic information are integrated into a recurrent neural network-based language model. The resulting models show great potential for alleviating data sparsity problems thanks to the additional knowledge provided. This approach of integrating global information is not restricted to language modeling but can easily be applied to any model that profits from context or further data resources, e.g., neural machine translation. In experiments on two language pairs, rescoring with this model improved the output of a state-of-the-art phrase-based translation system by 0.84 BLEU points.
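Below is a minimal sketch, in PyTorch, of one plausible way to realize the integration the abstract describes: concatenating a Wikipedia-derived word-class embedding and a sentence-level topic feature to the word embedding at each input step of a recurrent language model. The class and topic inputs, all layer sizes, and the model name are illustrative assumptions, not the authors' exact architecture.

```python
# Illustrative sketch (not the paper's exact model): an RNN language model
# whose per-step input concatenates a word embedding, a semantic word-class
# embedding (e.g., derived from Wikipedia categories), and a dense projection
# of a sentence-level topic distribution. All names and sizes are assumptions.
import torch
import torch.nn as nn

class ClassTopicRNNLM(nn.Module):
    def __init__(self, vocab_size, num_classes, num_topics,
                 word_dim=128, class_dim=32, hidden_dim=256):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.class_emb = nn.Embedding(num_classes, class_dim)
        # Project the sentence-level topic distribution to a dense feature.
        self.topic_proj = nn.Linear(num_topics, class_dim)
        self.rnn = nn.GRU(word_dim + 2 * class_dim, hidden_dim,
                          batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, words, classes, topics, hidden=None):
        # words, classes: (batch, seq_len); topics: (batch, num_topics)
        seq_len = words.size(1)
        topic_feat = self.topic_proj(topics)            # (batch, class_dim)
        # Broadcast the same topic feature to every time step of the sentence.
        topic_feat = topic_feat.unsqueeze(1).expand(-1, seq_len, -1)
        x = torch.cat([self.word_emb(words),
                       self.class_emb(classes),
                       topic_feat], dim=-1)
        output, hidden = self.rnn(x, hidden)
        return self.out(output), hidden                 # logits over vocab
```

In a rescoring setup like the one reported, such a model would assign a language-model score to each n-best hypothesis, with the topic vector estimated once per sentence (e.g., from an LDA-style topic model) and the class IDs looked up from a Wikipedia-derived word-to-class mapping; both of these preprocessing choices are assumptions for illustration.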
Anthology ID:
2016.iwslt-1.2
Volume:
Proceedings of the 13th International Conference on Spoken Language Translation
Month:
December 8-9
Year:
2016
Address:
Seattle, Washington
Editors:
Mauro Cettolo, Jan Niehues, Sebastian Stüker, Luisa Bentivogli, Rolando Cattoni, Marcello Federico
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
International Workshop on Spoken Language Translation
URL:
https://aclanthology.org/2016.iwslt-1.2
Cite (ACL):
Yang Zhang, Jan Niehues, and Alexander Waibel. 2016. Integrating Encyclopedic Knowledge into Neural Language Models. In Proceedings of the 13th International Conference on Spoken Language Translation, Seattle, Washington. International Workshop on Spoken Language Translation.
Cite (Informal):
Integrating Encyclopedic Knowledge into Neural Language Models (Zhang et al., IWSLT 2016)
PDF:
https://aclanthology.org/2016.iwslt-1.2.pdf