Semi-Supervised Learning with Auxiliary Evaluation Component for Large Scale e-Commerce Text Classification

Mingkuan Liu, Musen Wen, Selcuk Kopru, Xianjing Liu, Alan Lu


Abstract
The lack of high-quality labeled training data is a critical challenge for many industrial machine learning tasks. To tackle it, we propose a semi-supervised learning method that uses unlabeled data and user feedback signals to improve the performance of ML models. The method employs a primary model, Main, and an auxiliary evaluation model, Eval; the two are trained iteratively on labeled data generated automatically from unlabeled data and/or users’ feedback signals. We apply the approach to several text classification tasks and report results on both the publicly available Yahoo! Answers dataset and our e-commerce product classification dataset. The proposed method reduces the classification error rate by 4% to 15% across the experimental setups and datasets. A detailed comparison with other semi-supervised learning approaches is presented in the paper; across the text classification tasks, our method outperforms those from previous related studies.
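The iterative Main/Eval training described in the abstract can be illustrated with a small, self-contained sketch. The abstract does not specify the model architectures, the acceptance rule, or the training schedule, so everything here is an assumption made for illustration: `MainModel` is a toy word-count classifier, `EvalModel` is a hypothetical margin-threshold judge fitted on held-out labeled data, and `self_train` is an invented helper name. User feedback signals are not modeled; only unlabeled text is used.

```python
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


class MainModel:
    """Toy stand-in for the primary model (assumption, not the paper's model)."""

    def fit(self, examples):
        # Count how often each token appears under each label.
        self.counts = defaultdict(Counter)
        for text, label in examples:
            self.counts[label].update(tokenize(text))
        return self

    def predict(self, text):
        # Score each label by summed token counts; return (label, margin),
        # where margin is the gap between the best and second-best score.
        tokens = tokenize(text)
        scores = {lab: sum(c[t] for t in tokens) for lab, c in self.counts.items()}
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        best_label, best_score = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        return best_label, best_score - runner_up


class EvalModel:
    """Hypothetical evaluator: accepts a prediction if its margin is large enough."""

    def fit(self, records):
        # records: (margin, was_correct) pairs observed on held-out labeled data.
        wrong = [margin for margin, correct in records if not correct]
        self.threshold = max(wrong) + 1 if wrong else 1
        return self

    def accept(self, margin):
        return margin >= self.threshold


def self_train(labeled, held_out, unlabeled, rounds=3):
    train, pool = list(labeled), list(unlabeled)
    main = MainModel().fit(train)
    for _ in range(rounds):
        # Fit Eval on how Main currently behaves on held-out labeled data.
        records = []
        for text, gold in held_out:
            pred, margin = main.predict(text)
            records.append((margin, pred == gold))
        evaluator = EvalModel().fit(records)

        # Let Main label the pool; keep only predictions that Eval accepts.
        accepted, remaining = [], []
        for text in pool:
            pred, margin = main.predict(text)
            if evaluator.accept(margin):
                accepted.append((text, pred))
            else:
                remaining.append(text)
        if not accepted:
            break
        train.extend(accepted)
        pool = remaining
        main = MainModel().fit(train)  # retrain Main on the enlarged set
    return main, train


labeled = [("red apple fruit", "food"), ("fast car engine", "auto")]
held_out = [("green apple", "food"), ("car wheel", "auto")]
unlabeled = ["apple pie", "engine oil", "blue sky"]
main, augmented = self_train(labeled, held_out, unlabeled)
```

In this toy run, "apple pie" and "engine oil" are pseudo-labeled and added to the training set, while "blue sky" is rejected because Main has no evidence for either class. A real system would replace both toy classes with learned models and would likely use a richer acceptance signal than a single margin threshold.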
Anthology ID:
W18-3409
Volume:
Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP
Month:
July
Year:
2018
Address:
Melbourne
Editors:
Reza Haffari, Colin Cherry, George Foster, Shahram Khadivi, Bahar Salehi
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
68–76
URL:
https://aclanthology.org/W18-3409
DOI:
10.18653/v1/W18-3409
Cite (ACL):
Mingkuan Liu, Musen Wen, Selcuk Kopru, Xianjing Liu, and Alan Lu. 2018. Semi-Supervised Learning with Auxiliary Evaluation Component for Large Scale e-Commerce Text Classification. In Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP, pages 68–76, Melbourne. Association for Computational Linguistics.
Cite (Informal):
Semi-Supervised Learning with Auxiliary Evaluation Component for Large Scale e-Commerce Text Classification (Liu et al., ACL 2018)
PDF:
https://aclanthology.org/W18-3409.pdf
Software:
W18-3409.Software.zip
Data:
Yahoo! Answers