Data Quality Estimation Framework for Faster Tax Code Classification

Ravikumar Kondadadi; Allen Williams; Nicolas Nicolov

doi:10.18653/v1/2022.ecnlp-1.4

Abstract

This paper describes a novel framework for estimating the data quality of a collection of product descriptions, in order to identify the information required to classify product listings accurately for tax-code assignment. Our Data Quality Estimation (DQE) framework consists of a Question Answering (QA) based attribute-value extraction model that identifies missing attributes and a classification model that identifies low-quality records. We show that our framework can accurately predict the quality of product descriptions. In addition to identifying low-quality product listings, our framework can generate a detailed category-level report of missing product information, which results in a better customer experience.
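To make the idea of a category-level report concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical table of required attributes per category (`REQUIRED`) and replaces the QA-based extraction model with a toy keyword lookup (`CUES`); all names, categories, and attributes here are illustrative assumptions. The classification model for low-quality records is not sketched.

```python
from collections import Counter, defaultdict

# Hypothetical required attributes per category (illustrative, not from the paper).
REQUIRED = {
    "apparel": ["material", "size"],
    "beverage": ["volume", "alcohol_content"],
}

# Toy stand-in for the QA-based extractor: keyword cues per attribute.
# A real system would query a QA model for each attribute instead.
CUES = {
    "material": ["cotton", "wool", "polyester"],
    "size": ["small", "medium", "large"],
    "volume": ["ml", "litre", "liter"],
    "alcohol_content": ["abv", "non-alcoholic"],
}


def extract_attributes(description):
    """Return the set of attributes for which a cue word appears in the text."""
    words = description.lower().split()
    return {attr for attr, cues in CUES.items() if any(c in words for c in cues)}


def missing_attributes(category, description):
    """List the required attributes of a category that were not found."""
    found = extract_attributes(description)
    return [attr for attr in REQUIRED[category] if attr not in found]


def category_report(listings):
    """Count, per category, how often each required attribute is missing."""
    report = defaultdict(Counter)
    for category, description in listings:
        for attr in missing_attributes(category, description):
            report[category][attr] += 1
    return {category: dict(counts) for category, counts in report.items()}


listings = [
    ("apparel", "Blue cotton shirt"),
    ("apparel", "Large wool sweater"),
    ("beverage", "Sparkling water 500 ml"),
]
print(category_report(listings))
```

In this sketch a listing contributes to the report only for attributes that its category requires, so the output can be read as a per-category list of the information sellers most often leave out.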

Anthology ID: 2022.ecnlp-1.4
Volume: Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5)
Month: May
Year: 2022
Address: Dublin, Ireland
Editors: Shervin Malmasi, Oleg Rokhlenko, Nicola Ueffing, Ido Guy, Eugene Agichtein, Surya Kallumadi
Venue: ECNLP
Publisher: Association for Computational Linguistics
Pages: 29–34
URL: https://aclanthology.org/2022.ecnlp-1.4
DOI: 10.18653/v1/2022.ecnlp-1.4
Cite (ACL): Ravi Kondadadi, Allen Williams, and Nicolas Nicolov. 2022. Data Quality Estimation Framework for Faster Tax Code Classification. In Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5), pages 29–34, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal): Data Quality Estimation Framework for Faster Tax Code Classification (Kondadadi et al., ECNLP 2022)
PDF: https://aclanthology.org/2022.ecnlp-1.4.pdf
Video: https://aclanthology.org/2022.ecnlp-1.4.mp4
Data: MAVE
