The University of Maryland translation system for IWSLT 2007

Christopher J. Dyer


Abstract
This paper describes the University of Maryland statistical machine translation system used in the IWSLT 2007 evaluation. Our focus was threefold: using hierarchical phrase-based models in spoken language translation, incorporating sub-lexical information in model estimation via morphological analysis (Arabic) and word and character segmentation (Chinese), and using n-gram sequence models for source-side punctuation prediction. Our efforts yield significant improvements in Chinese-English and Arabic-English translation tasks for both spoken language and human transcription conditions.
Anthology ID:
2007.iwslt-1.28
Volume:
Proceedings of the Fourth International Workshop on Spoken Language Translation
Month:
October 15-16
Year:
2007
Address:
Trento, Italy
Venue:
IWSLT
SIG:
SIGSLT
URL:
https://aclanthology.org/2007.iwslt-1.28
Cite (ACL):
Christopher J. Dyer. 2007. The University of Maryland translation system for IWSLT 2007. In Proceedings of the Fourth International Workshop on Spoken Language Translation, Trento, Italy.
Cite (Informal):
The University of Maryland translation system for IWSLT 2007 (Dyer, IWSLT 2007)
PDF:
https://aclanthology.org/2007.iwslt-1.28.pdf