Online language model adaptation for spoken dialog translation

Germán Sanchis-Trilles, Mauro Cettolo, Nicola Bertoldi, Marcello Federico


Abstract
This paper focuses on the problem of language model adaptation in the context of Chinese-English cross-lingual dialogs, as set up by the challenge task of the IWSLT 2009 Evaluation Campaign. Mixtures of n-gram language models are investigated, obtained by clustering bilingual training data according to three available human annotations: at the dialog level, the turn level, and the dialog act level. For the last case, clustering of the IWSLT data was induced through a comparable Italian-English parallel corpus provided with dialog act annotations. For adaptation, mixture weights are estimated on the source side, at the level of either a single source sentence or the whole test set. The estimated weights are then transferred to the target language mixture model. Experimental results show that training several specific language models and weighting them according to the actual input, instead of using a single target language model, yields significant gains in terms of perplexity and BLEU.
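The weight estimation step described in the abstract can be illustrated with a minimal sketch. It assumes the mixture is a linear interpolation of component language models and that the weights are fitted with expectation-maximization, a standard choice for this kind of model; the abstract does not state which estimator the authors use. The function names and the toy probabilities are hypothetical and serve only as illustration.

```python
def estimate_mixture_weights(component_probs, iterations=50):
    """Fit linear-interpolation weights with EM (assumed estimator).

    component_probs[t][k] is the probability that component LM k assigns
    to token t of the adaptation text (a source sentence or a test set).
    """
    if not component_probs:
        raise ValueError("need at least one token")
    k = len(component_probs[0])
    weights = [1.0 / k] * k  # start from uniform weights
    for _ in range(iterations):
        counts = [0.0] * k
        for probs in component_probs:
            total = sum(w * p for w, p in zip(weights, probs))
            if total == 0.0:
                continue  # token unseen by every component: skip it
            # E-step: posterior responsibility of each component for this token
            for i in range(k):
                counts[i] += weights[i] * probs[i] / total
        norm = sum(counts)
        if norm == 0.0:
            break
        # M-step: renormalize the expected counts into new weights
        weights = [c / norm for c in counts]
    return weights


def mixture_prob(weights, probs):
    """Probability of one token under the weighted mixture."""
    return sum(w * p for w, p in zip(weights, probs))


# Toy source-side probabilities for three tokens under two component LMs.
source_probs = [[0.20, 0.01], [0.10, 0.02], [0.05, 0.04]]
weights = estimate_mixture_weights(source_probs)

# Transfer: reuse the source-side weights on the target-side components.
target_token_probs = [0.15, 0.03]  # hypothetical target-side values
adapted = mixture_prob(weights, target_token_probs)
```

In this sketch the weights are fitted only on source-side probabilities and then applied unchanged to the target-side components, which mirrors the transfer step the abstract mentions. Passing the tokens of one sentence gives sentence-level adaptation; passing the tokens of a whole test set gives set-level adaptation.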
Anthology ID:
2009.iwslt-papers.5
Volume:
Proceedings of the 6th International Workshop on Spoken Language Translation: Papers
Month:
December 1-2
Year:
2009
Address:
Tokyo, Japan
Venue:
IWSLT
SIG:
SIGSLT
Pages:
160–167
URL:
https://aclanthology.org/2009.iwslt-papers.5
Cite (ACL):
Germán Sanchis-Trilles, Mauro Cettolo, Nicola Bertoldi, and Marcello Federico. 2009. Online language model adaptation for spoken dialog translation. In Proceedings of the 6th International Workshop on Spoken Language Translation: Papers, pages 160–167, Tokyo, Japan.
Cite (Informal):
Online language model adaptation for spoken dialog translation (Sanchis-Trilles et al., IWSLT 2009)
PDF:
https://aclanthology.org/2009.iwslt-papers.5.pdf
Presentation:
 2009.iwslt-papers.5.Presentation.pdf