Better Learning and Decoding for Syntax Based SMT Using PSDIG

Yuan Ding, Martha Palmer


Abstract
As an approach to syntax-based statistical machine translation (SMT), Probabilistic Synchronous Dependency Insertion Grammars (PSDIG), introduced by Ding and Palmer (2005), are a version of synchronous grammars defined on dependency trees. In this paper we discuss better learning and decoding algorithms for a PSDIG MT system. We introduce two new grammar learners: (1) an exhaustive learner combining different heuristics, and (2) an n-gram-based grammar learner. Combining the grammar rules learned by the two learners improves performance. We also introduce a better decoding algorithm that incorporates a trigram language model. According to the BLEU metric, the PSDIG MT system performs significantly better than IBM Model 4 and on par with the state-of-the-art phrase-based system Pharaoh (Koehn, 2004). The improved integration of syntax on both the source and target languages opens the door to more sophisticated SMT processes.
Anthology ID:
2006.amta-papers.5
Volume:
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers
Month:
August 8-12
Year:
2006
Address:
Cambridge, Massachusetts, USA
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
Pages:
37–45
URL:
https://aclanthology.org/2006.amta-papers.5
Cite (ACL):
Yuan Ding and Martha Palmer. 2006. Better Learning and Decoding for Syntax Based SMT Using PSDIG. In Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers, pages 37–45, Cambridge, Massachusetts, USA. Association for Machine Translation in the Americas.
Cite (Informal):
Better Learning and Decoding for Syntax Based SMT Using PSDIG (Ding & Palmer, AMTA 2006)
PDF:
https://aclanthology.org/2006.amta-papers.5.pdf