Learn to Copy from the Copying History: Correlational Copy Network for Abstractive Summarization
Haoran Li | Song Xu | Peng Yuan | Yujia Wang | Youzheng Wu | Xiaodong He | Bowen Zhou
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
The copying mechanism has had considerable success in abstractive summarization, enabling models to copy words directly from the input text to the output summary. Existing works mostly employ encoder-decoder attention, which applies copying at each time step independently of the previous ones. However, this may sometimes lead to incomplete copying. In this paper, we propose a novel copying scheme named Correlational Copying Network (CoCoNet) that enhances the standard copying mechanism by keeping track of the copying history. It thereby takes advantage of prior copying distributions and, at each time step, explicitly encourages the model to copy the input word that is relevant to the previously copied one. In addition, we strengthen CoCoNet through pre-training with suitable corpora that simulate the copying behaviors. Experimental results show that CoCoNet copies more accurately and achieves new state-of-the-art performance on summarization benchmarks, including CNN/DailyMail for news summarization and SAMSum for dialogue summarization. The code and checkpoint will be publicly available.
K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce
Song Xu | Haoran Li | Peng Yuan | Yujia Wang | Youzheng Wu | Xiaodong He | Ying Liu | Bowen Zhou
Findings of the Association for Computational Linguistics: EMNLP 2021
Existing pre-trained language models (PLMs) have demonstrated the effectiveness of self-supervised learning for a broad range of natural language processing (NLP) tasks. However, most of them are not explicitly aware of domain-specific knowledge, which is essential for downstream tasks in many domains, such as tasks in e-commerce scenarios. In this paper, we propose K-PLUG, a knowledge-injected pre-trained language model based on the encoder-decoder Transformer that can be transferred to both natural language understanding and generation tasks. Specifically, we propose five knowledge-aware self-supervised pre-training objectives to formulate the learning of domain-specific knowledge, including e-commerce domain-specific knowledge bases, aspects of product entities, categories of product entities, and unique selling propositions of product entities. We verify our method in a diverse range of e-commerce scenarios that require domain-specific knowledge, including product knowledge base completion, abstractive product summarization, and multi-turn dialogue. K-PLUG significantly outperforms baselines across the board, which demonstrates that the proposed method effectively learns a diverse set of domain-specific knowledge for both language understanding and generation tasks. Our code is available.