Lightweight Transformers for Conversational AI

Daniel Pressel, Wenshuo Liu, Michael Johnston, Minhua Chen


Abstract
To understand how training on conversational language impacts the performance of pre-trained models on downstream dialogue tasks, we build compact Transformer-based language models from scratch on several large corpora of conversational data. We compare the performance and characteristics of these models against BERT and other strong baselines on dialogue probing tasks. Commercial dialogue systems typically require a small footprint and fast execution time, but recent trends are in the opposite direction, with ever-increasing numbers of parameters that complicate model deployment. We focus instead on training fast, lightweight models that excel at natural language understanding (NLU) and can replace existing lower-capacity conversational AI models of similar size and speed. In the process, we develop a simple but unique curriculum-based approach that moves from general-purpose to dialogue-targeted training, both in the data and in the objective. Our resulting models have around 1/3 the number of parameters of BERT-base and produce better representations for a wide array of intent detection datasets under linear and mutual-information probing techniques. Additionally, the models can be easily fine-tuned on a single consumer GPU and deployed in near real-time production environments.
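For readers unfamiliar with the linear-probing evaluation the abstract refers to, the sketch below shows the general workflow: freeze a pre-trained encoder, embed each utterance, and fit a linear classifier on the frozen features to measure representation quality for intent detection. The model name, mean pooling, and toy data here are illustrative assumptions, not the authors' exact protocol.

```python
# Minimal linear-probe sketch (assumed setup, not the paper's exact recipe):
# freeze a pre-trained encoder, pool its hidden states into sentence
# embeddings, and train only a logistic-regression classifier on top.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def embed(texts):
    # Mean-pool the final hidden states over non-padding tokens.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state           # [B, T, H]
    mask = batch["attention_mask"].unsqueeze(-1).float()  # [B, T, 1]
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

# Toy intent-detection examples; in practice these would be utterances
# and intent labels from a corpus such as BANKING77.
train_texts = ["I lost my card", "what is my balance", "transfer money to savings"]
train_labels = ["card_lost", "balance", "transfer"]

probe = LogisticRegression(max_iter=1000).fit(embed(train_texts), train_labels)
print(probe.predict(embed(["my card is missing"])))
```

A mutual-information probe would replace the logistic regression with an estimator of the mutual information between representations and labels; the frozen-encoder workflow above stays the same.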
Anthology ID:
2022.naacl-industry.25
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track
Month:
July
Year:
2022
Address:
Hybrid: Seattle, Washington + Online
Editors:
Anastassia Loukina, Rashmi Gangadharaiah, Bonan Min
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
221–229
URL:
https://aclanthology.org/2022.naacl-industry.25
DOI:
10.18653/v1/2022.naacl-industry.25
Cite (ACL):
Daniel Pressel, Wenshuo Liu, Michael Johnston, and Minhua Chen. 2022. Lightweight Transformers for Conversational AI. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track, pages 221–229, Hybrid: Seattle, Washington + Online. Association for Computational Linguistics.
Cite (Informal):
Lightweight Transformers for Conversational AI (Pressel et al., NAACL 2022)
PDF:
https://aclanthology.org/2022.naacl-industry.25.pdf
Video:
https://aclanthology.org/2022.naacl-industry.25.mp4
Data
BANKING77, C4, SentEval