Investigation of the effects of ASR tuning on speech translation performance

Paul R. Dixon, Andrew Finch, Chiori Hori, Hideki Kashioka


Abstract
In this paper we describe some of our recent investigations into ASR and SMT coupling issues from an ASR perspective. Our study had three motivations: first, to understand how standard ASR tuning procedures affect SMT performance and whether it is safe to perform this tuning in isolation; second, to investigate how vocabulary and segmentation mismatches between the ASR and SMT systems affect performance; third, to uncover any practical issues that arise when using a WFST-based speech decoder for tight coupling as opposed to a more traditional tree-search decoding architecture. On the IWSLT07 Japanese-English task we found that larger language model weights only helped SMT performance when the ASR decoder was tuned in a sub-optimal manner. When we considered the performance with suitably wide beams that ensured the ASR accuracy had converged, we observed that the language model weight had little influence on the SMT BLEU scores. After the construction of the phrase table, the actual SMT vocabulary can be smaller than the training data vocabulary. Reducing the ASR lexicon to cover only the words the SMT system could accept led to an increase in the ASR error rates, but the SMT BLEU scores were nearly unchanged. From a practical point of view this is a useful result, as it means we can significantly reduce the memory footprint of the ASR system. We also investigated coupling WFST-based ASR to a simple WFST-based translation decoder and found it was crucial to perform phrase table expansion to avoid OOV problems. For the WFST translation decoder we describe a semiring-based approach for optimizing the log-linear weights.
Anthology ID:
2011.iwslt-evaluation.22
Volume:
Proceedings of the 8th International Workshop on Spoken Language Translation: Evaluation Campaign
Month:
December 8-9
Year:
2011
Address:
San Francisco, California
Editors:
Marcello Federico, Mei-Yuh Hwang, Margit Rödder, Sebastian Stüker
Venue:
IWSLT
SIG:
SIGSLT
Pages:
167–174
URL:
https://aclanthology.org/2011.iwslt-evaluation.22
Cite (ACL):
Paul R. Dixon, Andrew Finch, Chiori Hori, and Hideki Kashioka. 2011. Investigation of the effects of ASR tuning on speech translation performance. In Proceedings of the 8th International Workshop on Spoken Language Translation: Evaluation Campaign, pages 167–174, San Francisco, California.
Cite (Informal):
Investigation of the effects of ASR tuning on speech translation performance (Dixon et al., IWSLT 2011)
PDF:
https://aclanthology.org/2011.iwslt-evaluation.22.pdf