PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation

Siqi Bao; Huang He; Fan Wang; Hua Wu (吴华); Haifeng Wang; Wenquan Wu; Zhihua Wu; Zhen Guo; Hua Lu; Xinxian Huang; Xin Tian; Xinchao Xu; Yingzhan Lin; Zheng-Yu Niu

doi:10.18653/v1/2022.findings-aacl.10

Abstract

To explore the limit of dialogue generation pre-training, we present the PLATO-XL models with up to 11 billion parameters, trained on both Chinese and English social media conversations. To train such large models, we adopt the architecture of unified transformer with high computation and parameter efficiency. In addition, we carry out multi-party aware pre-training to better distinguish the characteristic information in social media conversations. With such designs, PLATO-XL achieves superior performance compared to other approaches in both Chinese and English chitchat. We further explore the capacity of PLATO-XL on other conversational tasks, such as knowledge-grounded dialogue and task-oriented conversation. The experimental results indicate that PLATO-XL obtains state-of-the-art results across multiple conversational tasks, verifying its potential as a foundation model of conversational AI.

Anthology ID: 2022.findings-aacl.10
Volume: Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022
Month: November
Year: 2022
Address: Online only
Editors: Yulan He, Heng Ji, Sujian Li, Yang Liu, Chia-Hui Chang
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 107–118
URL: https://aclanthology.org/2022.findings-aacl.10/
DOI: 10.18653/v1/2022.findings-aacl.10
Cite (ACL): Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang, Wenquan Wu, Zhihua Wu, Zhen Guo, Hua Lu, Xinxian Huang, Xin Tian, Xinchao Xu, Yingzhan Lin, and Zheng-Yu Niu. 2022. PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation. In Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, pages 107–118, Online only. Association for Computational Linguistics.
Cite (Informal): PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation (Bao et al., Findings 2022)
PDF: https://aclanthology.org/2022.findings-aacl.10.pdf
