Effects of Speaker Bias in Dialect Identification and Automatic Transcription with Self-Supervised Speech Models

Olli Kuparinen

Abstract

A major issue in audio modeling is speaker bias, in which models learn language-external traits, such as a speaker’s timbre or pitch, and use this information as a shortcut to a language task. This is especially problematic for dialectology, as dialect corpora typically have only a few speakers representing a complete dialect area. In this paper, we explore the effects of speaker bias in two dialectal tasks: dialect identification and automatic dialectal transcription. We build two different data partitions of dialect interviews in Finnish and Norwegian: 1) a speaker-dependent partition in which all of the speakers appear in the training, development, and test sets, and 2) a speaker-independent partition in which each speaker appears in exactly one set. We further experiment with modifications of the training data by augmenting the original audio with pitch shifts and noise, as well as changing the original speakers’ voices with voice conversion models. We show that the dialect identification models are highly affected by speaker bias, whereas automatic dialectal transcription models are not. The audio modifications do not offer major performance gains for either of the languages or tasks.
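
The two partitions described in the abstract can be illustrated with a minimal sketch. This is not the paper's code: the record layout (a dict with a `speaker` key), the function names, and the split fractions are all assumptions made for illustration, and a real dialect corpus would probably also need stratification by dialect area.

```python
import random
from collections import defaultdict


def speaker_independent_split(utterances, dev_frac=0.1, test_frac=0.1, seed=0):
    """Assign whole speakers to one set each, so no speaker spans two sets."""
    # Hypothetical record layout: each utterance is a dict with a "speaker" key.
    speakers = sorted({u["speaker"] for u in utterances})
    rng = random.Random(seed)
    rng.shuffle(speakers)
    n_test = max(1, round(len(speakers) * test_frac))
    n_dev = max(1, round(len(speakers) * dev_frac))
    test_speakers = set(speakers[:n_test])
    dev_speakers = set(speakers[n_test:n_test + n_dev])
    split = {"train": [], "dev": [], "test": []}
    for u in utterances:
        if u["speaker"] in test_speakers:
            split["test"].append(u)
        elif u["speaker"] in dev_speakers:
            split["dev"].append(u)
        else:
            split["train"].append(u)
    return split


def speaker_dependent_split(utterances, dev_frac=0.1, test_frac=0.1, seed=0):
    """Divide each speaker's utterances across train, dev, and test."""
    by_speaker = defaultdict(list)
    for u in utterances:
        by_speaker[u["speaker"]].append(u)
    rng = random.Random(seed)
    split = {"train": [], "dev": [], "test": []}
    for speaker in sorted(by_speaker):
        items = list(by_speaker[speaker])
        rng.shuffle(items)
        # At least one utterance per speaker goes to dev and to test;
        # this assumes every speaker has at least three utterances.
        n_test = max(1, round(len(items) * test_frac))
        n_dev = max(1, round(len(items) * dev_frac))
        split["test"].extend(items[:n_test])
        split["dev"].extend(items[n_test:n_test + n_dev])
        split["train"].extend(items[n_test + n_dev:])
    return split
```

Under the speaker-independent split, a classifier cannot score well on the test set merely by recognising voices it saw in training, which is the shortcut the abstract calls speaker bias.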

Anthology ID: 2026.vardial-1.3
Volume: Proceedings of the 13th Workshop on NLP for Similar Languages, Varieties and Dialects
Month: March
Year: 2026
Address: Rabat, Morocco
Venues: VarDial | WS
Publisher: Association for Computational Linguistics
Pages: 32–44
URL: https://aclanthology.org/2026.vardial-1.3/
Cite (ACL): Olli Kuparinen. 2026. Effects of Speaker Bias in Dialect Identification and Automatic Transcription with Self-Supervised Speech Models. In Proceedings of the 13th Workshop on NLP for Similar Languages, Varieties and Dialects, pages 32–44, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Effects of Speaker Bias in Dialect Identification and Automatic Transcription with Self-Supervised Speech Models (Kuparinen, VarDial 2026)
PDF: https://aclanthology.org/2026.vardial-1.3.pdf
