Combining Real and Synthetic Speech for ASR Adaptation in Brazilian Portuguese

Daniel R. da Silva; Maria Eduarda S. Borba; Állan C. P. Silva; Maria Carolina S. Barreto; Arthur F. de Morais; Paulo V. dos Santos; Guilherme C. Dutra; Sávio S. T. de Oliveira; Anderson da S. Soares

Abstract

Automatic Speech Recognition (ASR) systems require large amounts of annotated speech, which is difficult to obtain in specialized domains. This paper introduces GARAGEM (General Automotive Real and Artificial speech corpus for Garage Environments and Maintenance in Brazilian Portuguese), a domain-specific ASR dataset for Brazilian Portuguese focused on automotive repair. It combines real speech collected from online sources with synthetic speech generated from curated technical terminology. A reproducible methodology is proposed, encompassing real data acquisition, domain-guided synthetic data generation, dataset consolidation, and ASR model fine-tuning. Experiments conducted with the Whisper, Wav2vec 2.0, and Conformer models show that synthetic data provides improvements when used to complement real recordings. Quantitative and qualitative analyses show reductions in Word Error Rate (WER) and Character Error Rate (CER), as well as improved recognition of domain-specific terms absent from the real training set. The results indicate that domain-guided synthetic speech is an effective data augmentation strategy for ASR adaptation in specialized and low-resource scenarios.
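For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how WER and CER are conventionally computed: the Levenshtein edit distance between reference and hypothesis, divided by the reference length, counted in words for WER and in characters for CER. The function names and the example sentences are illustrative assumptions, not taken from the paper, and the paper's own evaluation may apply text normalization that is not shown here.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: char-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example sentences (not from the GARAGEM corpus).
reference = "trocar a pastilha de freio"
hypothesis = "trocar a pastilha do freio"
print(f"WER = {wer(reference, hypothesis):.3f}")
print(f"CER = {cer(reference, hypothesis):.3f}")
```

In this hypothetical pair a single substituted word counts as one error out of five words for WER but only one error out of 26 characters for CER, which is why the two metrics are usually reported together.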

Anthology ID: 2026.propor-1.83
Volume: Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
Month: April
Year: 2026
Address: Salvador, Brazil
Editors: Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
Venue: PROPOR
Publisher: Association for Computational Linguistics
Pages: 838–846
URL: https://aclanthology.org/2026.propor-1.83/
Cite (ACL): Daniel R. da Silva, Maria Eduarda S. Borba, Állan C. P. Silva, Maria Carolina S. Barreto, Arthur F. de Morais, Paulo V. dos Santos, Guilherme C. Dutra, Sávio S. T. de Oliveira, and Anderson da S. Soares. 2026. Combining Real and Synthetic Speech for ASR Adaptation in Brazilian Portuguese. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1, pages 838–846, Salvador, Brazil. Association for Computational Linguistics.
Cite (Informal): Combining Real and Synthetic Speech for ASR Adaptation in Brazilian Portuguese (Silva et al., PROPOR 2026)
PDF: https://aclanthology.org/2026.propor-1.83.pdf
