Exploring Inspiration Sets in a Data Programming Pipeline for Product Moderation

Justine Winkler, Simon Brugman, Bas van Berkel, Martha Larson


Abstract
We carry out a case study on the use of data programming to create data to train classifiers used for product moderation on a large e-commerce platform. Data programming is a recently introduced technique that uses human-defined rules to generate training data sets without tedious item-by-item hand labeling. Our study investigates methods for allowing product moderators to quickly modify the rules given their knowledge of the domain and, especially, of textual item descriptions. Our results suggest that moderators can use this approach to steer the training data, enabling fast and close control of classifiers that detect policy violations.
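The rule-based labeling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the rule names, keywords, and label constants below are hypothetical, and a simple majority vote stands in for the learned label model that data programming systems typically use to combine rules.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical label constants; not taken from the paper.
VIOLATION = 1
OK = 0
ABSTAIN = None

LabelingFunction = Callable[[str], Optional[int]]


def lf_mentions_replica(text: str) -> Optional[int]:
    """Hypothetical rule: 'replica' hints at a policy violation."""
    return VIOLATION if "replica" in text.lower() else ABSTAIN


def lf_unbranded_copy(text: str) -> Optional[int]:
    """Hypothetical rule: 'unbranded copy' hints at a policy violation."""
    return VIOLATION if "unbranded copy" in text.lower() else ABSTAIN


def lf_official_warranty(text: str) -> Optional[int]:
    """Hypothetical rule: 'official warranty' hints at a legitimate item."""
    return OK if "official warranty" in text.lower() else ABSTAIN


LFS: List[LabelingFunction] = [
    lf_mentions_replica,
    lf_unbranded_copy,
    lf_official_warranty,
]


def weak_label(text: str, lfs: List[LabelingFunction]) -> Optional[int]:
    """Combine rule votes by majority; abstain on no votes or on a tie."""
    votes = [v for v in (lf(text) for lf in lfs) if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting rules with equal support
    return ranked[0][0]


descriptions = [
    "High quality replica watch",
    "Phone with official warranty",
    "Plain cotton t-shirt",
]
# Keep only items that received a label; these would form the training set.
training_data = [
    (d, weak_label(d, LFS)) for d in descriptions
    if weak_label(d, LFS) is not ABSTAIN
]
```

In this sketch a moderator would steer the training data by editing or adding rule functions, rather than by relabeling individual items.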
Anthology ID:
2021.ecnlp-1.16
Volume:
Proceedings of the 4th Workshop on e-Commerce and NLP
Month:
August
Year:
2021
Address:
Online
Editors:
Shervin Malmasi, Surya Kallumadi, Nicola Ueffing, Oleg Rokhlenko, Eugene Agichtein, Ido Guy
Venue:
ECNLP
Publisher:
Association for Computational Linguistics
Pages:
132–139
URL:
https://aclanthology.org/2021.ecnlp-1.16
DOI:
10.18653/v1/2021.ecnlp-1.16
Cite (ACL):
Justine Winkler, Simon Brugman, Bas van Berkel, and Martha Larson. 2021. Exploring Inspiration Sets in a Data Programming Pipeline for Product Moderation. In Proceedings of the 4th Workshop on e-Commerce and NLP, pages 132–139, Online. Association for Computational Linguistics.
Cite (Informal):
Exploring Inspiration Sets in a Data Programming Pipeline for Product Moderation (Winkler et al., ECNLP 2021)
PDF:
https://aclanthology.org/2021.ecnlp-1.16.pdf