One System, Many Domains: Open-Domain Statistical Machine Translation via Feature Augmentation

Jonathan Clark, Alon Lavie, Chris Dyer


Abstract
In this paper, we introduce a simple technique for incorporating domain information into a statistical machine translation system that significantly improves translation quality when test data comes from multiple domains. Our approach augments (conjoins) standard translation model and language model features with domain indicator features and requires only minimal modifications to the optimization and decoding procedures. We evaluate our method on two language pairs with varying numbers of domains, and observe significant improvements of up to 1.0 BLEU.
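A minimal sketch of what conjoining features with domain indicators might look like, assuming each hypothesis carries a dictionary of named feature values and each input sentence has a known domain label. The function names (`augment_features`, `score`), the `name@domain` naming scheme, and the choice to keep the original domain-independent copy of each feature are illustrative assumptions, not details taken from the paper.

```python
from typing import Dict


def augment_features(features: Dict[str, float], domain: str) -> Dict[str, float]:
    """Return the features plus one domain-conjoined copy of each.

    Assumption: the original (shared) feature is kept, so the tuner can
    learn a general weight and a per-domain adjustment.
    """
    augmented = dict(features)  # shared, domain-independent copies
    for name, value in features.items():
        # The conjoined feature fires only for sentences from `domain`.
        augmented[f"{name}@{domain}"] = value
    return augmented


def score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear model score; features without a weight contribute zero."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


# Hypothetical feature values for one translation hypothesis.
base = {"lm": -12.5, "tm_fwd": -3.0}
news = augment_features(base, "news")

# Hypothetical weights: a shared LM weight plus a news-specific adjustment.
weights = {"lm": 0.5, "lm@news": 0.25}
news_score = score(news, weights)
```

Under these assumptions the decoder's scoring function stays a plain dot product; only the feature set grows, which is in keeping with the abstract's remark that optimization and decoding need only minimal changes.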
Anthology ID:
2012.amta-papers.4
Volume:
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers
Month:
October 28-November 1
Year:
2012
Address:
San Diego, California, USA
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
URL:
https://aclanthology.org/2012.amta-papers.4
Cite (ACL):
Jonathan Clark, Alon Lavie, and Chris Dyer. 2012. One System, Many Domains: Open-Domain Statistical Machine Translation via Feature Augmentation. In Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers, San Diego, California, USA. Association for Machine Translation in the Americas.
Cite (Informal):
One System, Many Domains: Open-Domain Statistical Machine Translation via Feature Augmentation (Clark et al., AMTA 2012)
PDF:
https://aclanthology.org/2012.amta-papers.4.pdf