Vincenzo Norman Vitale

2025

Toward Optimised Datasets to Fine-tune ASR Systems Leveraging Less but More Informative Speech
Loredana Schettino | Vincenzo Norman Vitale | Alessandro Vietti
Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)

Using End-to-End Automatic Speech Recognizers’ Internals to Model Disfluencies in Italian Patients with Early-stage Parkinson’s Disease
Loredana Schettino | Vincenzo Norman Vitale | Marta Maffia
Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)

2024

Modelling Filled Particles and Prolongation Using End-to-end Automatic Speech Recognition Systems: A Quantitative and Qualitative Analysis.
Vincenzo Norman Vitale | Loredana Schettino | Francesco Cutugno
Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024)

State-of-the-art automatic speech recognition systems based on End-to-End models (E2E-ASRs) achieve remarkable performances. However, phenomena that characterize spoken language, such as fillers (eeh, ehm) or segmental prolongations (theee), are still mostly considered disruptive elements that should be excluded to obtain optimal transcriptions, despite their acknowledged regularity and communicative value. A recent study showed that two types of pre-trained systems with the same Conformer-based encoding architecture but different decoders – a Connectionist Temporal Classification (CTC) decoder and a Transducer decoder – tend to model some speech features that are functional for the identification of filled pauses and prolongations in speech. This work builds upon these findings by investigating which of the two systems performs better at filler and prolongation detection tasks and by conducting an error analysis to deepen our understanding of how these systems work.

2023

Automatic Detection of Parkinson’s Disease with Connected Speech Acoustic Features: Towards a Linguistically Interpretable Approach
Marta Maffia | Loredana Schettino | Vincenzo Norman Vitale
Proceedings of the Ninth Italian Conference on Computational Linguistics (CLiC-it 2023)

On Incrementing Interpretability of Machine Learning Models from the Foundations: A Study on Syllabic Speech Units
Vincenzo Norman Vitale | Loredana Schettino | Francesco Cutugno
Proceedings of the Ninth Italian Conference on Computational Linguistics (CLiC-it 2023)

Venues

CLiC-it (5)
