Weakly supervised hierarchical multi-task classification of customer questions

Jitenkumar Rana, Promod Yenigalla, Chetan Aggarwal, Sandeep Sricharan Mukku, Manan Soni, Rashmi Patange


Abstract
Identifying granular and actionable topics from customer questions (CQ) posted on e-commerce websites helps surface information that customers expect but do not find on the product detail page (DP), gives brands and sellers insight into the critical product information customers look for before making a purchase decision, and enriches catalog quality to improve the overall customer experience (CX). We propose a weakly supervised Hierarchical Multi-task Classification Framework (HMCF) to identify topics from customer questions at various granularities. The complexity lies in creating a list of granular topics (a taxonomy) for thousands of product categories and in building a scalable classification system. To this end, we introduce a clustering-based Taxonomy Creation and Data Labeling (TCDL) module that creates the taxonomy and labelled data with minimal supervision. With the TCDL module, creating the taxonomy and labelled data takes 2 hours, compared with 2 weeks of manual effort by a subject matter expert. For classification, we propose a two-level HMCF that performs multi-class classification to identify the coarse level-1 topic and uses an NLI-based label-aware approach to identify the granular level-2 topic. We show that HMCF (based on BERT and NLI) a) achieves an absolute improvement of 13% in Top-1 accuracy over single-task non-hierarchical baselines, b) learns a generic domain-invariant function that can adapt to a constantly evolving taxonomy (open label set) without re-training, and c) significantly reduces model deployment effort, since a single model serves thousands of product categories.
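The two-level, label-aware idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the taxonomy, the keyword lists, and the functions `predict_level1`, `overlap_score`, and `classify` are hypothetical, and a token-overlap score stands in for the BERT classifier and the NLI entailment score. Restricting level-2 candidates to the children of the predicted level-1 topic is also an assumption.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical taxonomy: coarse level-1 topic -> granular level-2 topics.
TAXONOMY: Dict[str, List[str]] = {
    "battery": ["battery life", "charging time"],
    "size": ["product dimensions", "weight"],
}

# Hypothetical keyword lists standing in for a trained level-1 classifier.
LEVEL1_KEYWORDS: Dict[str, Set[str]] = {
    "battery": {"battery", "charge", "charging"},
    "size": {"size", "dimensions", "weight", "heavy"},
}


def tokens(text: str) -> Set[str]:
    """Lowercase, drop question marks, split on whitespace."""
    return set(text.lower().replace("?", " ").split())


def predict_level1(question: str) -> str:
    """Stand-in for multi-class classification: pick the topic with most keyword hits."""
    q = tokens(question)
    return max(LEVEL1_KEYWORDS, key=lambda topic: len(LEVEL1_KEYWORDS[topic] & q))


def overlap_score(question: str, label: str) -> float:
    """Stand-in for an NLI entailment score: share of label tokens found in the question."""
    label_tokens = tokens(label)
    return len(label_tokens & tokens(question)) / len(label_tokens)


def classify(
    question: str,
    taxonomy: Dict[str, List[str]],
    score: Callable[[str, str], float] = overlap_score,
) -> Tuple[str, str]:
    """Predict level 1, then score the question against each candidate level-2 label text."""
    level1 = predict_level1(question)
    level2 = max(taxonomy[level1], key=lambda label: score(question, label))
    return level1, level2


first = classify("How long is the battery life on this phone?", TAXONOMY)

# Open label set: a new level-2 label is added as text only; the scorer is unchanged.
TAXONOMY["battery"].append("battery replacement")
second = classify("Is battery replacement possible?", TAXONOMY)
```

Because the level-2 step scores (question, label text) pairs rather than predicting a fixed class index, a new label can be added to the taxonomy without changing the scoring function, which is the property the abstract describes as adapting to an open label set.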
Anthology ID:
2023.acl-industry.75
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Sunayana Sitaram, Beata Beigman Klebanov, Jason D Williams
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
786–793
URL:
https://aclanthology.org/2023.acl-industry.75
DOI:
10.18653/v1/2023.acl-industry.75
Cite (ACL):
Jitenkumar Rana, Promod Yenigalla, Chetan Aggarwal, Sandeep Sricharan Mukku, Manan Soni, and Rashmi Patange. 2023. Weakly supervised hierarchical multi-task classification of customer questions. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track), pages 786–793, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Weakly supervised hierarchical multi-task classification of customer questions (Rana et al., ACL 2023)
PDF:
https://aclanthology.org/2023.acl-industry.75.pdf