CULG: Commercial Universal Language Generation

Haonan Li; Yameng Huang; Yeyun Gong; Jian Jiao; Ruofei Zhang; Timothy Baldwin; Nan Duan

doi:10.18653/v1/2022.naacl-industry.14

Abstract

Pre-trained language models (PLMs) have dramatically improved performance for many natural language processing (NLP) tasks in domains such as finance and healthcare. However, the application of PLMs in the domain of commerce, especially marketing and advertising, remains less studied. In this work, we adapt pre-training methods to the domain of commerce, by proposing CULG, a large-scale commercial universal language generation model which is pre-trained on a corpus drawn from 10 markets across 7 languages. We propose 4 commercial generation tasks and a two-stage training strategy for pre-training, and demonstrate that the proposed strategy yields performance improvements on three generation tasks as compared to single-stage pre-training. Extensive experiments show that our model outperforms other models by a large margin on commercial generation tasks, and we conclude with a discussion on additional applications over other markets, languages, and tasks.

Anthology ID: 2022.naacl-industry.14
Volume: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track
Month: July
Year: 2022
Address: Hybrid: Seattle, Washington + Online
Editors: Anastassia Loukina, Rashmi Gangadharaiah, Bonan Min
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 112–120
URL: https://aclanthology.org/2022.naacl-industry.14/
DOI: 10.18653/v1/2022.naacl-industry.14
Cite (ACL): Haonan Li, Yameng Huang, Yeyun Gong, Jian Jiao, Ruofei Zhang, Timothy Baldwin, and Nan Duan. 2022. CULG: Commercial Universal Language Generation. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track, pages 112–120, Hybrid: Seattle, Washington + Online. Association for Computational Linguistics.
Cite (Informal): CULG: Commercial Universal Language Generation (Li et al., NAACL 2022)
PDF: https://aclanthology.org/2022.naacl-industry.14.pdf
