A Comparative Analysis of Speaker Diarization Models: Creating a Dataset for German Dialectal Speech

Lea Fischbach


Abstract
Speaker diarization is a critical task in the field of computer science, aiming to assign timestamps and speaker labels to audio segments. The tests in this publication aim to find a pretrained speaker diarization pipeline capable of distinguishing dialectal speakers from each other and from an explorer. To achieve this, three pipelines, namely Pyannote, CLEAVER and NeMo, are tested and compared across various segmentation and parameterization strategies. The study considers multiple scenarios, such as the impact of threshold values, overlap handling, and minimum duration parameters on classification accuracy. Additionally, this study aims to create a dataset for German dialect identification (DID) based on the findings from this research.
Anthology ID:
2024.fieldmatters-1.6
Volume:
Proceedings of the 3rd Workshop on NLP Applications to Field Linguistics (Field Matters 2024)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Oleg Serikov, Ekaterina Voloshina, Anna Postnikova, Saliha Muradoglu, Eric Le Ferrand, Elena Klyachko, Ekaterina Vylomova, Tatiana Shavrina, Francis Tyers
Venues:
FieldMatters | WS
Publisher:
Association for Computational Linguistics
Pages:
43–51
URL:
https://aclanthology.org/2024.fieldmatters-1.6
Cite (ACL):
Lea Fischbach. 2024. A Comparative Analysis of Speaker Diarization Models: Creating a Dataset for German Dialectal Speech. In Proceedings of the 3rd Workshop on NLP Applications to Field Linguistics (Field Matters 2024), pages 43–51, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
A Comparative Analysis of Speaker Diarization Models: Creating a Dataset for German Dialectal Speech (Fischbach, FieldMatters-WS 2024)
PDF:
https://aclanthology.org/2024.fieldmatters-1.6.pdf