Label efficient semi-supervised conversational intent classification

Mandar Kulkarni, Kyung Kim, Nikesh Garera, Anusua Trivedi


Abstract
To provide a convenient shopping experience and to answer user queries at scale, conversational platforms are essential for e-commerce. User queries can be pre-purchase questions, such as those about product specifications and delivery time, or post-purchase questions, such as those about exchanges and returns. A chatbot should be able to understand and answer a variety of such queries to help users with relevant information. One of the important modules in the chatbot is automated intent identification, i.e., understanding the user’s intention from the query text. Because non-English-speaking users interact with the chatbot, we often receive a significant percentage of code-mixed queries and queries with grammatical errors, which makes the problem more challenging. This paper proposes a simple yet effective Semi-Supervised Learning (SSL) approach for label-efficient intent classification. We use a small labeled corpus and a relatively larger set of unlabeled queries to train a transformer model. To train the model with labeled data, we explore supervised MixUp data augmentation. To train with unlabeled data, we explore label consistency under dropout noise. We experiment with different pre-trained transformer architectures, such as BERT and sentence-BERT. Experimental results demonstrate that the proposed approach significantly improves over the supervised baseline, even with a limited labeled set. A variant of the model is currently deployed in production.
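The two training signals named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say where MixUp is applied, which consistency loss is used, or how the two losses are weighted. The sketch assumes MixUp on fixed-size feature vectors with one-hot labels, a squared-error consistency loss between two dropout-perturbed passes, and a toy linear classifier head in place of a transformer. All function names and default values (`mixup`, `consistency_loss`, `alpha=0.4`, `p=0.1`) are hypothetical.

```python
import math
import random


def mixup(x_a, y_a, x_b, y_b, alpha=0.4, rng=random):
    """Blend two labeled examples (feature vectors and one-hot labels)."""
    # Assumption: mixing coefficient drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dropout(h, p, rng=random):
    """Inverted dropout: zero each unit with probability p, rescale the rest."""
    return [0.0 if rng.random() < p else v / (1.0 - p) for v in h]


def classify(h, weights):
    """Toy linear head (stand-in for a transformer): one weight vector per class."""
    return softmax([sum(w_i * h_i for w_i, h_i in zip(w, h)) for w in weights])


def consistency_loss(h, weights, p=0.1, rng=random):
    """Squared difference between predictions from two dropout-perturbed passes.

    Assumption: squared error is used here for simplicity; a KL-divergence
    term would be an equally plausible choice.
    """
    p1 = classify(dropout(h, p, rng), weights)
    p2 = classify(dropout(h, p, rng), weights)
    return sum((a - b) ** 2 for a, b in zip(p1, p2))
```

In a training loop, the supervised loss would be computed on the mixed pair returned by `mixup`, and `consistency_loss` would be evaluated on unlabeled queries and added to it with some weight; that weighting is not specified in the abstract.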
Anthology ID:
2023.acl-industry.11
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Sunayana Sitaram, Beata Beigman Klebanov, Jason D Williams
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
96–102
URL:
https://aclanthology.org/2023.acl-industry.11
DOI:
10.18653/v1/2023.acl-industry.11
Cite (ACL):
Mandar Kulkarni, Kyung Kim, Nikesh Garera, and Anusua Trivedi. 2023. Label efficient semi-supervised conversational intent classification. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track), pages 96–102, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Label efficient semi-supervised conversational intent classification (Kulkarni et al., ACL 2023)
PDF:
https://aclanthology.org/2023.acl-industry.11.pdf