Dial2vec: Self-Guided Contrastive Learning of Unsupervised Dialogue Embeddings

Che Liu, Rui Wang, Junfeng Jiang, Yongbin Li, Fei Huang


Abstract
In this paper, we introduce the task of learning unsupervised dialogue embeddings. Trivial approaches such as combining pre-trained word or sentence embeddings and encoding through pre-trained language models (PLMs) have been shown to be feasible for this task. However, these approaches typically ignore the conversational interactions between interlocutors, resulting in poor performance. To address this issue, we propose a self-guided contrastive learning approach named dial2vec. Dial2vec considers a dialogue as an information exchange process. It captures the interaction patterns between interlocutors and leverages them to guide the learning of the embeddings corresponding to each interlocutor. The dialogue embedding is then obtained by aggregating the embeddings from all interlocutors. To verify our approach, we establish a comprehensive benchmark consisting of six widely-used dialogue datasets. We consider three evaluation tasks: domain categorization, semantic relatedness, and dialogue retrieval. Dial2vec achieves average absolute improvements of 8.7, 9.0, and 13.8 points in terms of purity, Spearman’s correlation, and mean average precision (MAP), respectively, over the strongest baseline on the three tasks. Further analysis shows that dial2vec obtains informative and discriminative embeddings for both interlocutors under the guidance of the conversational interactions and achieves the best performance when aggregating them through the interlocutor-level pooling strategy. All code and data are publicly available at https://github.com/AlibabaResearch/DAMO-ConvAI/tree/main/dial2vec.
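The aggregation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each turn already has a fixed-size embedding, and it assumes that "interlocutor-level pooling" means mean-pooling the turns of each speaker first and then averaging the resulting per-speaker vectors. The function names `mean_pool` and `interlocutor_level_pooling` are hypothetical and do not come from the dial2vec repository.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def mean_pool(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    if not vectors:
        raise ValueError("cannot pool an empty list of vectors")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def interlocutor_level_pooling(
    turn_embeddings: Sequence[Sequence[float]],
    speakers: Sequence[str],
) -> List[float]:
    """Pool turns per speaker, then average the per-speaker embeddings.

    Assumption: a plain mean is used at both levels; the paper may use a
    different aggregation.
    """
    if len(turn_embeddings) != len(speakers):
        raise ValueError("need exactly one speaker label per turn")

    # Group turn embeddings by the interlocutor who produced them.
    by_speaker: Dict[str, List[Sequence[float]]] = defaultdict(list)
    for embedding, speaker in zip(turn_embeddings, speakers):
        by_speaker[speaker].append(embedding)

    # One embedding per interlocutor, then one embedding per dialogue.
    per_speaker = [mean_pool(turns) for turns in by_speaker.values()]
    return mean_pool(per_speaker)


# Toy dialogue: the user speaks twice, the agent once.
turns = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]]
speakers = ["user", "agent", "user"]
dialogue_embedding = interlocutor_level_pooling(turns, speakers)
```

Under these assumptions each interlocutor contributes equally to the dialogue embedding, whereas a flat average over turns would weight the speaker with more turns more heavily.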
Anthology ID:
2022.emnlp-main.490
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
7272–7282
URL:
https://aclanthology.org/2022.emnlp-main.490
DOI:
10.18653/v1/2022.emnlp-main.490
Cite (ACL):
Che Liu, Rui Wang, Junfeng Jiang, Yongbin Li, and Fei Huang. 2022. Dial2vec: Self-Guided Contrastive Learning of Unsupervised Dialogue Embeddings. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 7272–7282, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Dial2vec: Self-Guided Contrastive Learning of Unsupervised Dialogue Embeddings (Liu et al., EMNLP 2022)
PDF:
https://aclanthology.org/2022.emnlp-main.490.pdf