Enhancing Phrase-Based Statistical Machine Translation by Learning Phrase Representations Using Long Short-Term Memory Network

Benyamin Ahmadnia; Bonnie Dorr

doi:10.26615/978-954-452-056-4_004

Abstract

Phrases play a key role in Machine Translation (MT). In this paper, we apply a Long Short-Term Memory (LSTM) model on top of conventional Phrase-Based Statistical MT (PBSMT). The core idea is to use an LSTM encoder-decoder to score the phrase pairs in the phrase table generated by the PBSMT system. Given a source sequence, the encoder and decoder are jointly trained to maximize the conditional probability of a target sequence. The performance of a PBSMT system is improved by using the conditional probabilities of phrase pairs computed by the LSTM encoder-decoder as an additional feature in the existing log-linear model. We compare the baseline PBSMT phrase tables with those rescored by the proposed LSTM and observe a positive impact on translation quality. We construct a PBSMT model using the Moses decoder and enrich the Language Model (LM) with an external dataset. We then rank the phrase tables using the LSTM-based encoder-decoder. This method produces a gain of up to 3.14 BLEU points on the test set.
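The idea of adding a phrase-pair probability as one more feature in a log-linear model can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `loglinear_score` and `rescore`, the feature weights, and the toy phrase pairs and probabilities are all assumptions made for illustration, and the LSTM encoder-decoder is replaced by a hard-coded lookup table of conditional probabilities.

```python
import math

def loglinear_score(feature_probs, weights):
    """Log-linear model score: sum_i w_i * log(h_i) over feature values h_i."""
    return sum(w * math.log(p) for w, p in zip(weights, feature_probs))

def rescore(phrase_table, neural_prob, base_weights, neural_weight):
    """Append a neural conditional probability as one extra feature, then rank.

    phrase_table maps (source, target) to a list of existing feature values.
    neural_prob maps (source, target) to p(target | source); here it is a
    hypothetical stand-in for an LSTM encoder-decoder.
    """
    scored = []
    for (src, tgt), probs in phrase_table.items():
        features = probs + [neural_prob[(src, tgt)]]
        weights = base_weights + [neural_weight]
        scored.append((loglinear_score(features, weights), src, tgt))
    # Highest score first.
    return sorted(scored, reverse=True)

# Toy values, invented for illustration only (not taken from the paper).
phrase_table = {
    ("das haus", "the house"): [0.6, 0.5],
    ("das haus", "the home"): [0.6, 0.5],
}
neural_prob = {
    ("das haus", "the house"): 0.8,
    ("das haus", "the home"): 0.2,
}

ranked = rescore(phrase_table, neural_prob, [1.0, 1.0], 1.0)
for score, src, tgt in ranked:
    print(f"{src} -> {tgt}: {score:.3f}")
```

In this toy setup the two candidates tie on the existing features, so the added feature alone decides the ranking. In a real system the feature weights would be tuned rather than fixed by hand.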

Anthology ID: R19-1004
Volume: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
Month: September
Year: 2019
Address: Varna, Bulgaria
Editors: Ruslan Mitkov, Galia Angelova
Venue: RANLP
Publisher: INCOMA Ltd.
Pages: 25–32
URL: https://aclanthology.org/R19-1004/
DOI: 10.26615/978-954-452-056-4_004
Cite (ACL): Benyamin Ahmadnia and Bonnie Dorr. 2019. Enhancing Phrase-Based Statistical Machine Translation by Learning Phrase Representations Using Long Short-Term Memory Network. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 25–32, Varna, Bulgaria. INCOMA Ltd.
Cite (Informal): Enhancing Phrase-Based Statistical Machine Translation by Learning Phrase Representations Using Long Short-Term Memory Network (Ahmadnia & Dorr, RANLP 2019)
PDF: https://aclanthology.org/R19-1004.pdf
