Pradhuman Jhala


Product Classification in E-Commerce using Distributional Semantics
Vivek Gupta | Harish Karnick | Ashendra Bansal | Pradhuman Jhala
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Product classification is the task of automatically predicting a taxonomy path for a product in a predefined taxonomy hierarchy, given a textual product description or title. Efficient product classification requires a suitable feature-vector representation of a document (the textual description of a product) and fast, efficient algorithms for prediction. To address these challenges, we propose a new distributional semantics representation for document vector formation. We also develop a new two-level ensemble approach that uses path-wise, node-wise and depth-wise classifiers (defined with respect to the taxonomy tree) to reduce error in the final product classification task. Our experiments on data sets from a leading e-commerce platform show the effectiveness of the distributional representation and the ensemble approach, with improved results on various evaluation metrics compared to earlier approaches.
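The abstract does not define the three classifier types, so the following is only an illustrative sketch of one plausible reading: a path-wise classifier predicts the whole taxonomy path as a single label, a node-wise classifier predicts membership for each node on the path, and a depth-wise classifier predicts one label per level of the tree. The function name `label_views`, the `>` separator and the example taxonomy path are all hypothetical and are not taken from the paper.

```python
def label_views(path, sep=">"):
    """Derive three training-label views from one taxonomy path.

    Assumption: this is one plausible interpretation of "path-wise",
    "node-wise" and "depth-wise"; the paper may define them differently.
    """
    # Split the path into its nodes, dropping surrounding whitespace.
    nodes = [n.strip() for n in path.split(sep) if n.strip()]
    return {
        # Path-wise: the full root-to-leaf path is a single class label.
        "path_wise": " > ".join(nodes),
        # Node-wise: every node on the path is a separate (binary) label.
        "node_wise": set(nodes),
        # Depth-wise: one multi-class label per tree level (1 = top level).
        "depth_wise": {depth: node for depth, node in enumerate(nodes, start=1)},
    }


# Hypothetical product taxonomy path, for illustration only.
views = label_views("Electronics > Cameras > DSLR")
print(views["path_wise"])
print(sorted(views["node_wise"]))
print(views["depth_wise"])
```

Under this reading, each view would feed a different first-level classifier, and a second-level ensemble would combine their predictions into one final taxonomy path.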