SwissADT: An Audio Description Translation System for Swiss Languages

Lukas Fischer; Yingqiang Gao; Alexa Lintner; Annette Rios Gonzales; Sarah Ebling

doi:10.18653/v1/2025.naacl-industry.31

Abstract

Audio description (AD) is a crucial accessibility service provided to blind persons and persons with visual impairment, designed to convey visual information in acoustic form. Despite recent advancements in multilingual machine translation research, the lack of well-crafted and time-synchronized AD data impedes the development of audio description translation (ADT) systems that address the needs of multilingual countries such as Switzerland. Furthermore, most ADT systems rely on text alone, and it is unclear whether incorporating visual information from video clips improves the quality of ADT outputs. In this work, we introduce SwissADT, an **emerging** ADT system for three main Swiss languages and English, designed for future use by our industry partners. By collecting well-crafted AD data augmented with video clips in German, French, Italian, and English, and leveraging the power of Large Language Models (LLMs), we aim to enhance information accessibility for diverse language populations in Switzerland by automatically translating AD scripts to the desired Swiss language. Our extensive experimental ADT results, composed of both automatic and human evaluations of ADT quality, demonstrate the promising capability of SwissADT for the ADT task. We believe that combining human expertise with the generation power of LLMs can further enhance the performance of ADT systems, ultimately benefiting a larger multilingual target population.

Anthology ID: 2025.naacl-industry.31
Volume: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Weizhu Chen, Yi Yang, Mohammad Kachuee, Xue-Yong Fu
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 370–379
URL: https://aclanthology.org/2025.naacl-industry.31/
DOI: 10.18653/v1/2025.naacl-industry.31
Cite (ACL): Lukas Fischer, Yingqiang Gao, Alexa Lintner, Annette Rios, and Sarah Ebling. 2025. SwissADT: An Audio Description Translation System for Swiss Languages. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track), pages 370–379, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): SwissADT: An Audio Description Translation System for Swiss Languages (Fischer et al., NAACL 2025)
PDF: https://aclanthology.org/2025.naacl-industry.31.pdf
