Daniel R. da Silva


2026

Automatic Speech Recognition (ASR) systems require large amounts of annotated speech, which is difficult to obtain in specialized domains. This paper introduces GARAGEM (General Automotive Real and Artificial speech corpus for Garage Environments and Maintenance in Brazilian Portuguese), a domain-specific ASR dataset focused on automotive repair that combines real speech collected from online sources with synthetic speech generated from curated technical terminology. A reproducible methodology is proposed, encompassing real data acquisition, domain-guided synthetic data generation, dataset consolidation, and ASR model fine-tuning. Experiments conducted with the Whisper, Wav2vec 2.0, and Conformer models show that synthetic data yields improvements when used to complement real recordings. Quantitative and qualitative analyses show reductions in Word Error Rate (WER) and Character Error Rate (CER), as well as improved recognition of domain-specific terms absent from the real training set. The results indicate that domain-guided synthetic speech is an effective data augmentation strategy for ASR adaptation in specialized and low-resource scenarios.
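For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how WER and CER are conventionally computed, as the Levenshtein edit distance between reference and hypothesis divided by the reference length. It is not the evaluation code used in this work: the function names and the example sentences are hypothetical, and no text normalization (casing, punctuation) is applied.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example sentences (not taken from the GARAGEM corpus).
reference = "trocar a pastilha de freio"
hypothesis = "trocar a pastilha do freio"
print(f"WER = {wer(reference, hypothesis):.3f}")
print(f"CER = {cer(reference, hypothesis):.3f}")
```

In this illustrative pair, a single substituted word out of five gives a WER of 0.2, while the same error touches only one of 26 characters, which is why CER is typically much lower than WER for near-miss transcriptions of technical terms.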