Sumit Negi


2021

Ad Headline Generation using Self-Critical Masked Language Model
Yashal Shakti Kanungo | Sumit Negi | Aruna Rajan
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers

For any e-commerce website, building enduring advertisements that attract shoppers is a nontrivial problem, and passing the site's creative quality bar is hard, especially at large scale. We thus propose a programmatic solution to generate product advertising headlines using retail content. Our approach is a state-of-the-art application of Reinforcement Learning (RL) policy gradient methods to Transformer-based (Vaswani et al., 2017) Masked Language Models (Devlin et al., 2019). Our method creates the advertising headline by jointly conditioning on the multiple products that a seller wishes to advertise. We demonstrate that our method outperforms existing Transformer and LSTM + RL methods on overlap metrics and in quality audits. We also show that our model-generated headlines outperform human-submitted headlines in both grammar and creative quality, as determined by audits.

Training Language Models under Resource Constraints for Adversarial Advertisement Detection
Eshwar Shamanna Girishekar | Shiv Surya | Nishant Nikhil | Dyut Kumar Sil | Sumit Negi | Aruna Rajan
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers

Advertising on e-commerce and social media sites delivers ad impressions at web scale on a daily basis, driving value for both shoppers and advertisers. This scale necessitates programmatic ways of detecting unsuitable content in ads to safeguard customer experience and trust. This paper focuses on techniques for training text classification models under resource constraints, built as part of automated solutions for advertising content moderation. We show how weak supervision, curriculum learning, and multilingual training can be applied effectively to fine-tune BERT and its variants for text classification tasks, in conjunction with different data augmentation strategies. Our extensive experiments on multiple languages show that these techniques detect adversarial ad categories with a substantial gain in precision at a high recall threshold over the baseline.

2014

Single Document Keyphrase Extraction Using Label Information
Sumit Negi
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2011

Labeling Unlabeled Data using Cross-Language Guided Clustering
Sachindra Joshi | Danish Contractor | Sumit Negi
Proceedings of 5th International Joint Conference on Natural Language Processing

Mining Bilingual Topic Hierarchies from Unaligned Text
Sumit Negi
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

Handling Noisy Queries in Cross Language FAQ Retrieval
Danish Contractor | Govind Kothari | Tanveer Faruquie | L. V. Subramaniam | Sumit Negi
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

2009

SMS based Interface for FAQ Retrieval
Govind Kothari | Sumit Negi | Tanveer A. Faruquie | Venkatesan T. Chakaravarthy | L. Venkata Subramaniam
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP