Improving Unsupervised Dialogue Topic Segmentation with Utterance-Pair Coherence Scoring

Linzi Xing, Giuseppe Carenini


Abstract
Dialogue topic segmentation is critical in several dialogue modeling problems. However, popular unsupervised approaches only exploit surface features in assessing topical coherence among utterances. In this work, we address this limitation by leveraging supervisory signals from the utterance-pair coherence scoring task. First, we present a simple yet effective strategy to generate a training corpus for utterance-pair coherence scoring. Then, we train a BERT-based neural utterance-pair coherence model on the resulting corpus. Finally, this model is used to measure the topical relevance between utterances, which serves as the basis for segmentation inference. Experiments on three public datasets in English and Chinese demonstrate that our proposal outperforms the state-of-the-art baselines.
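
The abstract does not spell out how segmentation is inferred from the coherence scores. As a rough illustration only, the sketch below assumes a TextTiling-style procedure: each gap between adjacent utterances gets a "depth" relative to the nearest coherence peaks on either side, and gaps whose depth exceeds a threshold become topic boundaries. The function names, the threshold rule (mean plus `alpha` standard deviations), and the hard-coded scores are all hypothetical; in practice the scores would come from the trained utterance-pair coherence model.

```python
from statistics import mean, pstdev


def depth_scores(coherence):
    """Depth of the coherence 'valley' at each gap between adjacent utterances.

    coherence[i] is an assumed coherence score between utterance i and i + 1.
    """
    depths = []
    for i, score in enumerate(coherence):
        # Walk left while scores do not decrease, to reach the nearest left peak.
        left = score
        for j in range(i - 1, -1, -1):
            if coherence[j] < left:
                break
            left = coherence[j]
        # Walk right in the same way, to reach the nearest right peak.
        right = score
        for j in range(i + 1, len(coherence)):
            if coherence[j] < right:
                break
            right = coherence[j]
        depths.append((left - score) + (right - score))
    return depths


def segment(coherence, alpha=0.5):
    """Return indices i such that a topic boundary falls after utterance i.

    The threshold (mean + alpha * std of the depths) is an illustrative
    assumption, not a value taken from the paper.
    """
    if not coherence:
        return []
    depths = depth_scores(coherence)
    threshold = mean(depths) + alpha * pstdev(depths)
    return [i for i, d in enumerate(depths) if d > threshold]


# Made-up scores for an 8-utterance dialogue (7 adjacent pairs).
scores = [0.9, 0.8, 0.2, 0.85, 0.9, 0.3, 0.8]
boundaries = segment(scores)
print(boundaries)
```

With these made-up scores the two deep valleys (after utterances 2 and 5) are selected, splitting the dialogue into three segments. A lower `alpha` would yield more boundaries and a higher one fewer.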
Anthology ID:
2021.sigdial-1.18
Volume:
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue
Month:
July
Year:
2021
Address:
Singapore and Online
Editors:
Haizhou Li, Gina-Anne Levow, Zhou Yu, Chitralekha Gupta, Berrak Sisman, Siqi Cai, David Vandyke, Nina Dethlefs, Yan Wu, Junyi Jessy Li
Venue:
SIGDIAL
SIG:
SIGDIAL
Publisher:
Association for Computational Linguistics
Pages:
167–177
URL:
https://aclanthology.org/2021.sigdial-1.18
DOI:
10.18653/v1/2021.sigdial-1.18
Cite (ACL):
Linzi Xing and Giuseppe Carenini. 2021. Improving Unsupervised Dialogue Topic Segmentation with Utterance-Pair Coherence Scoring. In Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 167–177, Singapore and Online. Association for Computational Linguistics.
Cite (Informal):
Improving Unsupervised Dialogue Topic Segmentation with Utterance-Pair Coherence Scoring (Xing & Carenini, SIGDIAL 2021)
PDF:
https://aclanthology.org/2021.sigdial-1.18.pdf
Video:
https://www.youtube.com/watch?v=04Urc5LRBlk
Code:
lxing532/Dialogue-Topic-Segmenter
Data:
DailyDialog, Doc2Dial