Edresson Casanova


2025

MuPe Life Stories Dataset: Spontaneous Speech in Brazilian Portuguese with a Case Study Evaluation on ASR Bias against Speakers Groups and Topic Modeling
Sidney Evaldo Leal | Arnaldo Candido Junior | Ricardo Marcacini | Edresson Casanova | Odilon Gonçalves | Anderson Silva Soares | Rodrigo Freitas Lima | Lucas Rafael Stefanel Gris | Sandra Aluísio
Proceedings of the 31st International Conference on Computational Linguistics

Recently, several public datasets for automatic speech recognition (ASR) in Brazilian Portuguese (BP) have been released, improving ASR systems performance. However, these datasets lack diversity in terms of age groups, regional accents, and education levels. In this paper, we present a new publicly available dataset consisting of 289 life story interviews (365 hours), featuring a broad range of speakers varying in age, education, and regional accents. First, we demonstrated the presence of bias in current BP ASR models concerning education levels and age groups. Second, we showed that our dataset helps mitigate these biases. Additionally, an ASR model trained on our dataset performed better during evaluation on a diverse test set. Finally, the ASR model trained with our dataset was extrinsically evaluated through a topic modeling task that utilized the automatically transcribed output.

2024

TTS applied to the generation of datasets for automatic speech recognition
Edresson Casanova | Sandra Aluísio | Moacir Antonelli Ponti
Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1

2021

Deep Learning against COVID-19: Respiratory Insufficiency Detection in Brazilian Portuguese Speech
Edresson Casanova | Lucas Gris | Augusto Camargo | Daniel da Silva | Murilo Gazzola | Ester Sabino | Anna Levin | Arnaldo Candido Jr | Sandra Aluisio | Marcelo Finger
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

Evaluating Sentence Segmentation in Different Datasets of Neuropsychological Language Tests in Brazilian Portuguese
Edresson Casanova | Marcos Treviso | Lilian Hübner | Sandra Aluísio
Proceedings of the Twelfth Language Resources and Evaluation Conference

Automatic analysis of connected speech by natural language processing techniques is a promising direction for diagnosing cognitive impairments. However, some difficulties still remain: the time required for manual narrative transcription and the decision on how transcripts should be divided into sentences for successful application of parsers used in metrics, such as Idea Density, to analyze the transcripts. The main goal of this paper was to develop a generic segmentation system for narratives of neuropsychological language tests. We explored the performance of our previous single-dataset-trained sentence segmentation architecture in a richer scenario involving three new datasets used to diagnose cognitive impairments, comprising different stories and two types of stimulus presentation for eliciting narratives — visual and oral — via illustrated story-book and sequence of scenes, and by retelling. Also, we proposed and evaluated three modifications to our previous RCNN architecture: (i) the inclusion of a Linear Chain CRF; (ii) the inclusion of a self-attention mechanism; and (iii) the replacement of the LSTM recurrent layer by a Quasi-Recurrent Neural Network layer. Our study allowed us to develop two new models for segmenting impaired speech transcriptions, along with an ideal combination of datasets and specific groups of narratives to be used as the training set.