Semantic-based Pre-training for Dialogue Understanding

Xuefeng Bai, Linfeng Song, Yue Zhang


Abstract
Pre-trained language models have made great progress on dialogue tasks. However, because these models are typically trained on surface dialogue text, they have been shown to be weak at understanding the main semantic meaning of a dialogue context. We investigate Abstract Meaning Representation (AMR) as explicit semantic knowledge that helps models capture the core semantic information in dialogues during pre-training. In particular, we propose a semantic-based pre-training framework that extends the standard pre-training framework (Devlin et al., 2019) with three tasks for learning 1) core semantic units, 2) semantic relations, and 3) the overall semantic representation according to AMR graphs. Experiments on the understanding of both chit-chat and task-oriented dialogues show the superiority of our model. To our knowledge, we are the first to leverage a deep semantic representation for dialogue pre-training.
Anthology ID: 2022.coling-1.49
Volume: Proceedings of the 29th International Conference on Computational Linguistics
Month: October
Year: 2022
Address: Gyeongju, Republic of Korea
Venue: COLING
Publisher: International Committee on Computational Linguistics
Pages: 592–607
URL: https://aclanthology.org/2022.coling-1.49
Cite (ACL): Xuefeng Bai, Linfeng Song, and Yue Zhang. 2022. Semantic-based Pre-training for Dialogue Understanding. In Proceedings of the 29th International Conference on Computational Linguistics, pages 592–607, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal): Semantic-based Pre-training for Dialogue Understanding (Bai et al., COLING 2022)
PDF: https://aclanthology.org/2022.coling-1.49.pdf
Code: goodbai-nlp/sem-plm
Data: CLINC150, DialoGLUE, DialogRE