Exploring Speaker-Related Information in Spoken Language Understanding for Better Speaker Diarization

Luyao Cheng; Siqi Zheng; Zhang Qinglin; Hui Wang; Yafeng Chen; Qian Chen

doi:10.18653/v1/2023.findings-acl.884

Abstract

Speaker diarization is a classic task in speech processing and is crucial in multi-party scenarios such as meetings and conversations. Current mainstream speaker diarization approaches consider acoustic information only, which results in performance degradation in adverse acoustic environments. In this paper, we propose methods to extract speaker-related information from semantic content in multi-party meetings, which, as we will show, can further benefit speaker diarization. We introduce two sub-tasks, Dialogue Detection and Speaker-Turn Detection, in which we effectively extract speaker information from conversational semantics. We also propose a simple yet effective algorithm to jointly model acoustic and semantic information and obtain speaker-identified texts. Experiments on both the AISHELL-4 and AliMeeting datasets show that our method achieves consistent improvements over acoustic-only speaker diarization systems.

Anthology ID: 2023.findings-acl.884
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 14068–14077
URL: https://aclanthology.org/2023.findings-acl.884
DOI: 10.18653/v1/2023.findings-acl.884
Cite (ACL): Luyao Cheng, Siqi Zheng, Zhang Qinglin, Hui Wang, Yafeng Chen, and Qian Chen. 2023. Exploring Speaker-Related Information in Spoken Language Understanding for Better Speaker Diarization. In Findings of the Association for Computational Linguistics: ACL 2023, pages 14068–14077, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Exploring Speaker-Related Information in Spoken Language Understanding for Better Speaker Diarization (Cheng et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-acl.884.pdf
Video: https://aclanthology.org/2023.findings-acl.884.mp4
