Happy Mittal


pdf bib
Large Scale Generative Multimodal Attribute Extraction for E-commerce Attributes
Anant Khandelwal | Happy Mittal | Shreyas Kulkarni | Deepak Gupta
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)

E-commerce websites (e.g. Amazon, Alibaba) have a plethora of structured and unstructured information (text and images) present on the product pages. Sellers often don’t label or mislabel values of the attributes (e.g. color, size etc.) for their products. Automatically identifying these attribute values from an eCommerce product page that contains both text and images is a challenging task, especially when the attribute value is not explicitly mentioned in the catalog. In this paper, we present a scalable solution for this problem where we pose attribute extraction problem as a question-answering task, which we solve using MXT, that consists of three key components: (i) MAG (Multimodal Adaptation Gate), (ii) Xception network, and (iii) T5 encoder-decoder. Our system consists of a generative model that generates attribute-values for a given product by using both textual and visual characteristics (e.g. images) of the product. We show that our system is capable of handling zero-shot attribute prediction (when attribute value is not seen in training data) and value-absent prediction (when attribute value is not mentioned in the text) which are missing in traditional classification-based and NER-based models respectively. We have trained our models using distant supervision, removing dependency on human labeling, thus making them practical for real-world applications. With this framework, we are able to train a single model for 1000s of (product-type, attribute) pairs, thus reducing the overhead of training and maintaining separate models. Extensive experiments on two real world datasets (total 57 attributes) show that our framework improves the absolute recall@90P by 10.16% and 6.9 from the existing state of the art models. In a popular e-commerce store, we have productionized our models that cater to 12K (product-type, attribute) pairs, and have extracted 150MM attribute values.


pdf bib
Distantly Supervised Transformers For E-Commerce Product QA
Happy Mittal | Aniket Chakrabarti | Belhassen Bayar | Animesh Anant Sharma | Nikhil Rasiwasia
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

We propose a practical instant question answering (QA) system on product pages of e-commerce services, where for each user query, relevant community question answer (CQA) pairs are retrieved. User queries and CQA pairs differ significantly in language characteristics making relevance learning difficult. Our proposed transformer-based model learns a robust relevance function by jointly learning unified syntactic and semantic representations without the need for human labeled data. This is achieved by distantly supervising our model by distilling from predictions of a syntactic matching system on user queries and simultaneously training with CQA pairs. Training with CQA pairs helps our model learning semantic QA relevance and distant supervision enables learning of syntactic features as well as the nuances of user querying language. Additionally, our model encodes queries and candidate responses independently allowing offline candidate embedding generation thereby minimizing the need for real-time transformer model execution. Consequently, our framework is able to scale to large e-commerce QA traffic. Extensive evaluation on user queries shows that our framework significantly outperforms both syntactic and semantic baselines in offline as well as large scale online A/B setups of a popular e-commerce service.