Overview of the IWSLT 2012 evaluation campaign
M. Federico | M. Cettolo | L. Bentivogli | M. Paul | S. Stüker
Proceedings of the 9th International Workshop on Spoken Language Translation: Evaluation Campaign
We report on the ninth evaluation campaign organized by the IWSLT workshop. This year, the evaluation offered multiple tracks on lecture translation based on the TED corpus, and one track on dialog translation from Chinese to English based on the Olympic trilingual corpus. The TED tracks comprised a speech transcription track in English, a speech translation track from English to French, and text translation tracks from English to French and from Arabic to English. In addition to the official tracks, ten unofficial MT tracks were offered that required translating TED talks into English from Chinese, Dutch, German, Polish, Portuguese (Brazilian), Romanian, Russian, Slovak, Slovene, or Turkish. Sixteen teams participated in the evaluation and submitted a total of 48 primary runs. All runs were evaluated with objective metrics, while runs of the official translation tracks were also ranked by crowd-sourced judges. For the TED task, subjective ranking was performed on a progress test, which permitted direct comparison of this year's results against the best results from the 2011 round of the evaluation campaign.
This paper presents a look inside the ITC-irst large-vocabulary SMT system developed for the NIST 2005 Chinese-to-English evaluation campaign. Experiments on official NIST test sets provide a thorough overview of the system's performance and show how individual components contribute to it. The system performs comparably to the best systems that participated in the NIST 2002-2004 MT evaluation campaigns: on the three test sets, it achieves BLEU scores of 26.35%, 26.92% and 28.13%, respectively.