Evaluating End-to-End Speech-to-Speech Translation for Dubbing: Challenges and New Metrics

Fred Bane

Abstract

The advent of end-to-end speech-to-speech translation (S2ST) systems in recent years marks a significant advancement over traditional cascaded approaches. These novel systems represent a direct translation pathway from spoken input to spoken output without relying on intermediate text forms. However, evaluation methods for this task, such as ASR BLEU, are often still compartmentalized and text-based. We suggest the quality of the resulting speech must be measured too. Naturalness, similarity of the target voice to the original, reflection of accents, and rhythm are all important. We argue that new evaluation metrics are needed in response to this watershed change. Our presentation approaches this topic through the lens of dubbing, with a particular focus on voice-over. We begin with a critical examination of existing metrics. Then we discuss key features of S2ST that are inadequately captured. Finally, we propose new directions for evaluation of S2ST systems.

Anthology ID: 2024.amta-presentations.13
Volume: Proceedings of the 16th Conference of the Association for Machine Translation in the Americas (Volume 2: Presentations)
Month: September
Year: 2024
Address: Chicago, USA
Editors: Marianna Martindale, Janice Campbell, Konstantin Savenkov, Shivali Goel
Venue: AMTA
Publisher: Association for Machine Translation in the Americas
Pages: 184–207
URL: https://aclanthology.org/2024.amta-presentations.13/
Cite (ACL): Fred Bane. 2024. Evaluating End-to-End Speech-to-Speech Translation for Dubbing: Challenges and New Metrics. In Proceedings of the 16th Conference of the Association for Machine Translation in the Americas (Volume 2: Presentations), pages 184–207, Chicago, USA. Association for Machine Translation in the Americas.
Cite (Informal): Evaluating End-to-End Speech-to-Speech Translation for Dubbing: Challenges and New Metrics (Bane, AMTA 2024)
PDF: https://aclanthology.org/2024.amta-presentations.13.pdf
