Hita Gupta

2025

Performance Gaps in Acted and Naturalistic Speech: Insights from Speech Emotion Recognition Strategies on Customer Service Calls
Lily Kawaoto | Hita Gupta | Ning Yu | Daniel Dakota
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era

Current research in speech emotion recognition (SER) often uses speech data produced by actors, which does not always represent naturalistic speech well. This can lead to challenges when applying models trained on such data to real-world data. We investigate the application of SER models developed on acted data and on more naturalistic podcasts to service call data, with a particular focus on anger detection. Our results indicate that while models trained on acted data show noticeable performance degradation on naturalistic data, weighted multimodal models developed on existing SER datasets, both acted and natural, show promise but are limited in their ability to recognize emotions that do not discernibly cluster.

Venues

RANLP (1)