Automatic Speech Interruption Detection: Analysis, Corpus, and System

Martin Lebourdais, Marie Tahon, Antoine Laurent, Sylvain Meignier


Abstract
Interruption detection is a new and challenging task in speech processing. This article presents a comprehensive study of automatic speech interruption detection, covering the definition of the task, the assembly of a specialized corpus, and the development of an initial baseline system. We provide three main contributions. First, we define the task, taking into account the nuanced nature of interruptions in spontaneous conversations. Second, we introduce a new corpus of conversational data, annotated for interruptions, to facilitate research in this domain. This corpus serves as a valuable resource for evaluating and advancing interruption detection techniques. Third, we present a first baseline system, which uses speech processing methods to automatically identify interruptions in speech, with promising results. In this article, we start from theoretical notions of interruption and derive a simplified notion based on overlapped speech detection. Our findings can serve not only as a foundation for further research in the field but also as a benchmark for assessing future advancements in automatic speech interruption detection.
Anthology ID:
2024.lrec-main.176
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
1959–1968
URL:
https://aclanthology.org/2024.lrec-main.176
Cite (ACL):
Martin Lebourdais, Marie Tahon, Antoine Laurent, and Sylvain Meignier. 2024. Automatic Speech Interruption Detection: Analysis, Corpus, and System. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 1959–1968, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Automatic Speech Interruption Detection: Analysis, Corpus, and System (Lebourdais et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.176.pdf