Daniel R. da Silva

2026

Combining Real and Synthetic Speech for ASR Adaptation in Brazilian Portuguese
Daniel R. da Silva | Maria Eduarda S. Borba | Állan C. P. Silva | Maria Carolina S. Barreto | Arthur F. de Morais | Paulo V. dos Santos | Guilherme C. Dutra | Sávio S. T. de Oliveira | Anderson da S. Soares
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1

Automatic Speech Recognition (ASR) systems require large amounts of annotated speech, which are difficult to obtain in specialized domains. This paper introduces GARAGEM (General Automotive Real and Artificial speech corpus for Garage Environments and Maintenance in Brazilian Portuguese), a domain-specific ASR dataset for Brazilian Portuguese focused on automotive repair. It combines real speech collected from online sources with synthetic speech generated from curated technical terminology. A reproducible methodology is proposed, encompassing real data acquisition, domain-guided synthetic data generation, dataset consolidation, and ASR model fine-tuning. Experiments conducted with the Whisper, Wav2vec 2.0, and Conformer models show that synthetic data provides improvements when used to complement real recordings. Quantitative and qualitative analyses show reductions in Word Error Rate (WER) and Character Error Rate (CER), as well as improved recognition of domain-specific terms absent from the real training set. The results indicate that domain-guided synthetic speech is an effective data augmentation strategy for ASR adaptation in specialized and low-resource scenarios.
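
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how WER and CER are commonly computed: the Levenshtein edit distance between reference and hypothesis, divided by the reference length, over words for WER and over characters for CER. The function names, the whitespace tokenization, the choice to count spaces as characters, and the example sentence are illustrative assumptions and are not taken from the paper, which may apply different text normalization.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate. Assumption: words are split on whitespace."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate. Assumption: spaces count as characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example: one misrecognized domain term out of five words.
reference = "trocar a pastilha de freio"
hypothesis = "trocar a pastilla de freio"
print(f"WER = {wer(reference, hypothesis):.3f}")
print(f"CER = {cer(reference, hypothesis):.3f}")
```

A single wrong character in a technical term costs a full word error under WER but only a small fraction under CER, which is one reason the two metrics are usually reported together.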

