Inference-Only Speaker Adaptation Improves Cross-Lingual Speech Emotion Recognition

Maciej Łachut


Abstract
Cross-lingual Speech Emotion Recognition (SER) is frequently hindered by speaker-specific prosodic variations that obscure universal emotional cues. Standard models often fail to generalize across languages due to the domain shift caused by differing acoustic standards. To address this, we present a novel SER approach that integrates unsupervised speaker adaptation directly at inference time. Our architecture uses a frozen, pretrained HuBERT encoder and introduces a Greedy Cluster Assignment Algorithm. This method groups a speaker’s utterances to form emotion-dependent centroids, enforcing speaker-consistent labeling without the computational cost of retraining. We evaluated this approach in a cross-lingual setting using the Polish nEMO dataset, which was excluded from training. Our method achieved the best performance in PolEval 2025 Task 4, improving the Macro F1 score from 0.619 to 0.753 on validation data and securing 1st place on the official leaderboard. The results demonstrate that inference-only clustering effectively disentangles ambiguous high-arousal categories, such as Fear and Surprise, by calibrating to the individual speaker’s vocal range.
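The abstract does not spell out the Greedy Cluster Assignment Algorithm, so the following is only a minimal sketch of one plausible reading of "emotion-dependent centroids", not the paper's actual procedure. It assumes that each utterance of a single speaker comes with an embedding and a vector of classifier probabilities; the function name `greedy_cluster_assign`, the confidence-ordered visiting scheme, and the Euclidean distance are all hypothetical choices made for illustration.

```python
import math


def greedy_cluster_assign(embeddings, probs):
    """Illustrative only: relabel one speaker's utterances via emotion centroids.

    embeddings: list of feature vectors, one per utterance (assumed input).
    probs: list of per-emotion classifier probabilities, one per utterance.
    """
    num_emotions = len(probs[0])
    dim = len(embeddings[0])
    sums = [[0.0] * dim for _ in range(num_emotions)]  # running sums per emotion
    counts = [0] * num_emotions
    labels = [None] * len(embeddings)

    # Visit utterances from most to least confident (sorted is stable on ties).
    order = sorted(range(len(probs)), key=lambda i: -max(probs[i]))
    for i in order:
        guess = max(range(num_emotions), key=lambda e: probs[i][e])
        if counts[guess] == 0:
            # The first utterance predicted as this emotion seeds its centroid.
            label = guess
        else:
            # Otherwise assign to the nearest centroid formed so far.
            seen = [e for e in range(num_emotions) if counts[e] > 0]
            label = min(
                seen,
                key=lambda e: math.dist(
                    embeddings[i], [s / counts[e] for s in sums[e]]
                ),
            )
        labels[i] = label
        counts[label] += 1
        sums[label] = [s + x for s, x in zip(sums[label], embeddings[i])]
    return labels


# Toy data: two well-separated groups; the last utterance is weakly
# predicted as emotion 1 but lies next to the emotion-0 group.
toy_embeddings = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [0.05, 0.1]]
toy_probs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8], [0.45, 0.55]]
toy_labels = greedy_cluster_assign(toy_embeddings, toy_probs)
print(toy_labels)  # [0, 0, 1, 1, 0]
```

In this toy run the low-confidence utterance is pulled to the label of its nearest speaker-specific centroid, which is the kind of speaker-consistent relabeling the abstract describes; the real system presumably operates on HuBERT-derived features and may differ in every detail.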
Anthology ID:
2025.poleval-main.12
Volume:
Proceedings of the PolEval 2025 Workshop
Month:
November
Year:
2025
Address:
Warsaw
Editors:
Łukasz Kobyliński, Alina Wróblewska, Maciej Ogrodniczuk
Venues:
PolEval | WS
Publisher:
Institute of Computer Science PAS and Association for Computational Linguistics
Pages:
82–90
URL:
https://aclanthology.org/2025.poleval-main.12/
Cite (ACL):
Maciej Łachut. 2025. Inference-Only Speaker Adaptation Improves Cross-Lingual Speech Emotion Recognition. In Proceedings of the PolEval 2025 Workshop, pages 82–90, Warsaw. Institute of Computer Science PAS and Association for Computational Linguistics.
Cite (Informal):
Inference-Only Speaker Adaptation Improves Cross-Lingual Speech Emotion Recognition (Łachut, PolEval 2025)
PDF:
https://aclanthology.org/2025.poleval-main.12.pdf