Ekaterina Harke
2022
Speech-to-Text and Evaluation of Multiple Machine Translation Systems
Evelyne Tzoukermann
|
Steven Van Guilder
|
Jennifer Doyon
|
Ekaterina Harke
Proceedings of the 15th Biennial Conference of the Association for Machine Translation in the Americas (Volume 2: Users and Providers Track and Government Track)
The National Virtual Translation Center (NVTC) and the larger Federal Bureau of Investiga-tion (FBI) seek to acquire tools that will facilitate its mission to provide English translations of non-English language audio and video files. In the text domain, NVTC has been using translation memory (TM) for some time and has reported on the incorporation of machine translation (MT) into that workflow. While we have explored the use of speech-to-text (STT) and speech translation (ST) in the past, we have now invested in the creation of a substantial human-created corpus to thoroughly evaluate alternatives in three languages: French, Rus-sian, and Persian. We report on the results of multiple STT systems combined with four MT systems for these languages. We evaluated and scored the different systems in combination and analyzed results. This points the way to the most successful tool combination to deploy in this workflow.