Improving Phrase-Based Statistical Machine Translation with Morpho-Syntactic Analysis and Transformation

Thai Phuong Nguyen, Akira Shimazu


Abstract
This paper presents our study of exploiting morpho-syntactic information for phrase-based statistical machine translation (SMT). For morphological transformation, we use hand-crafted transformational rules. For syntactic transformation, we propose a transformational model based on Bayes’ formula. The model is trained using a bilingual corpus and a broad-coverage parser of the source language. The morphological and syntactic transformations are used in the preprocessing phase of an SMT system. This preprocessing method is applicable to language pairs in which the target language is poor in resources. We applied the proposed method to translation from English to Vietnamese. Our experiments showed a BLEU-score improvement of more than 3.28% in comparison with Pharaoh, a state-of-the-art phrase-based SMT system.
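The preprocessing idea summarized in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the rule table, the tag set, the probabilities, and every function name below are hypothetical, and flat part-of-speech tags stand in for the parse trees a real parser would supply. It only shows the general shape of the approach: choose a source-side reordering by the product of a prior and a likelihood (Bayes' formula with the constant denominator dropped), then apply rule-based morphological rewriting before the text reaches the SMT system.

```python
# Toy sketch only: all rules and probabilities are invented for illustration.

# Hypothetical hand-crafted morphological rules: surface form -> analysed tokens.
MORPH_RULES = {
    "books": ["book", "+PL"],
    "walked": ["walk", "+PAST"],
}


def morph_transform(tokens):
    """Replace each token by its analysed form when a rule matches."""
    out = []
    for tok in tokens:
        out.extend(MORPH_RULES.get(tok, [tok]))
    return out


# Hypothetical reordering model for a single pattern (adjective + noun).
# PRIOR[r] plays the role of P(r), the prior of target-like order r.
# LIKELIHOOD[r] plays the role of P(source order | r).
PRIOR = {("ADJ", "NOUN"): 0.2, ("NOUN", "ADJ"): 0.8}
LIKELIHOOD = {("ADJ", "NOUN"): 0.5, ("NOUN", "ADJ"): 0.9}


def best_order():
    """Return argmax over r of P(r) * P(source | r)."""
    return max(PRIOR, key=lambda r: PRIOR[r] * LIKELIHOOD[r])


def syntactic_transform(tagged):
    """Swap adjacent ADJ NOUN pairs when the model prefers NOUN ADJ."""
    out = list(tagged)
    if best_order() == ("NOUN", "ADJ"):
        i = 0
        while i < len(out) - 1:
            if out[i][1] == "ADJ" and out[i + 1][1] == "NOUN":
                out[i], out[i + 1] = out[i + 1], out[i]
                i += 2  # skip past the pair just swapped
            else:
                i += 1
    return out


def preprocess(tagged):
    """Reorder first (tags are still aligned), then rewrite morphology."""
    reordered = syntactic_transform(tagged)
    return morph_transform([word for word, _ in reordered])


if __name__ == "__main__":
    print(preprocess([("red", "ADJ"), ("books", "NOUN")]))
```

In this sketch the reordering runs before the morphological rewriting so that the tags still line up one-to-one with the words; the ordering used in the paper itself is not stated in the abstract.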
Anthology ID:
2006.amta-papers.16
Volume:
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers
Month:
August 8-12
Year:
2006
Address:
Cambridge, Massachusetts, USA
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
Pages:
138–147
URL:
https://aclanthology.org/2006.amta-papers.16
Cite (ACL):
Thai Phuong Nguyen and Akira Shimazu. 2006. Improving Phrase-Based Statistical Machine Translation with Morpho-Syntactic Analysis and Transformation. In Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers, pages 138–147, Cambridge, Massachusetts, USA. Association for Machine Translation in the Americas.
Cite (Informal):
Improving Phrase-Based Statistical Machine Translation with Morpho-Syntactic Analysis and Transformation (Nguyen & Shimazu, AMTA 2006)
PDF:
https://aclanthology.org/2006.amta-papers.16.pdf