Amar Mahdhaoui
2007
The LIG Arabic/English speech translation system at IWSLT07
Laurent Besacier | Amar Mahdhaoui | Viet-Bac Le
Proceedings of the Fourth International Workshop on Spoken Language Translation
This paper describes the system that the LIG laboratory submitted to the IWSLT07 speech translation evaluation. LIG participated for the first time this year, in the Arabic-to-English speech translation task. For translation, we used a conventional statistical phrase-based system built with the open-source Moses decoder. We describe our baseline MT system and discuss in particular the use of an additional bilingual dictionary, which seems useful when little training data is available. The main contribution of this paper is a lattice decomposition algorithm that transforms a word lattice into a subword lattice compatible with our MT model, which uses word segmentation on the Arabic side. The lattice is then transformed into a confusion network (CN) that Moses can decode directly. The results show that this method outperforms conventional 1-best translation, which translates only the most probable ASR hypothesis. The best BLEU score from ASR output on the IWSLT06 evaluation data is 0.2253. The results confirm the benefit of full CN decoding for speech translation compared to the traditional ASR 1-best approach. Our primary system was ranked 7th of 14 in the IWSLT07 AE ASR task, with a BLEU score of 0.3804.
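The abstract does not give the details of the lattice decomposition algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a lattice stored as a list of edges `(start, end, label, log_prob)`, a hypothetical `segment` function that splits a word into subword units, and a simple scoring convention in which the original edge score stays on the first subword edge and the remaining subword edges get a log-probability of 0. The function name `decompose_lattice`, the toy transliterated words, and the `w+` prefix notation are all illustrative assumptions.

```python
from typing import Callable, List, Tuple

# One lattice edge: (start node, end node, label, log-probability).
Edge = Tuple[int, int, str, float]


def decompose_lattice(
    edges: List[Edge], segment: Callable[[str], List[str]]
) -> List[Edge]:
    """Replace each word edge by a chain of subword edges.

    Assumption: the word's score is kept on the first subword edge and
    the following edges carry log-probability 0.0, so every path keeps
    its original total score.
    """
    if not edges:
        return []

    # Fresh node ids start after the largest id already in use.
    next_node = max(max(start, end) for start, end, _, _ in edges) + 1
    result: List[Edge] = []

    for start, end, word, log_prob in edges:
        units = segment(word) or [word]  # fall back to the whole word
        prev = start
        for i, unit in enumerate(units):
            if i == len(units) - 1:
                target = end  # last unit reconnects to the original end node
            else:
                target = next_node  # intermediate node inside the word
                next_node += 1
            result.append((prev, target, unit, log_prob if i == 0 else 0.0))
            prev = target

    return result


# Toy segmentation table (hypothetical, transliterated for readability).
TOY_SEGMENTS = {"wktab": ["w+", "ktab"]}


def toy_segment(word: str) -> List[str]:
    return TOY_SEGMENTS.get(word, [word])


word_lattice = [
    (0, 1, "wktab", -0.2),
    (0, 1, "ktab", -1.6),
    (1, 2, "jdyd", -0.1),
]
subword_lattice = decompose_lattice(word_lattice, toy_segment)
```

In this toy lattice, the edge for `wktab` becomes two chained edges through a new intermediate node, while unsegmented words pass through unchanged. Converting the resulting subword lattice into a confusion network is a separate step that this sketch does not cover.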