Retrieval-Enhanced Dual Encoder Training for Product Matching

Justin Chiu

doi:10.18653/v1/2023.emnlp-industry.22

Abstract

Product matching is the task of matching a seller-listed item to the appropriate product. It is a critical task for an e-commerce platform, and the approach must be efficient enough to run at large scale. A dual encoder approach has recently become common practice for product matching because of its high performance and computational efficiency. In this paper, we propose a two-stage training procedure for the dual encoder model. Stage 1 trains a dual encoder to identify the more informative training data. Stage 2 then trains on that more informative data to obtain a better dual encoder model. This technique is a learned approach to building training data. We evaluate the retrieval-enhanced training on two datasets: a publicly available Large-Scale Product Matching dataset and a real-world e-commerce dataset containing 47 million products. Experimental results show that our approach improves F1 by 2% on the public dataset and by 9% on the real-world e-commerce dataset.
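The abstract does not say how the stage-1 model decides which training data is "more informative". One plausible reading, assumed here purely for illustration, is that the stage-1 encoder retrieves the top-k candidate products for each listing, and the retrieved products that are not the correct match are kept as hard negatives for stage 2. The sketch below follows that assumption. The bag-of-tokens `embed` function stands in for a trained encoder, and the names `mine_training_data`, `listings`, `products`, and `gold` are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in for a stage-1 encoder: L2-normalised bag-of-tokens vector."""
    counts = Counter(text.lower().split())
    # Fall back to 1.0 so that empty text does not divide by zero.
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {tok: c / norm for tok, c in counts.items()}


def score(a, b):
    """Dot product of two sparse vectors (cosine, since both are normalised)."""
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def mine_training_data(listings, products, gold, k=2):
    """Retrieve the top-k products per listing and keep the wrong ones.

    listings, products: dict mapping id -> text.
    gold: dict mapping listing id -> correct product id.
    Returns (listing_id, positive_product_id, negative_product_id) triples.
    """
    product_vecs = {pid: embed(text) for pid, text in products.items()}
    triples = []
    for lid, text in listings.items():
        query = embed(text)
        ranked = sorted(
            product_vecs,
            key=lambda pid: score(query, product_vecs[pid]),
            reverse=True,
        )
        for pid in ranked[:k]:
            if pid != gold[lid]:  # retrieved but wrong: assumed informative
                triples.append((lid, gold[lid], pid))
    return triples


# Toy data, invented for this example only.
products = {
    "p1": "apple iphone 13 128gb blue",
    "p2": "apple iphone 13 pro 128gb blue",
    "p3": "garden hose 15m green",
}
listings = {"l1": "iphone 13 blue 128gb unlocked"}
gold = {"l1": "p1"}

stage2_data = mine_training_data(listings, products, gold, k=2)
print(stage2_data)
```

On this toy data the near-duplicate "pro" product is retrieved alongside the correct one and becomes the negative, while the unrelated garden hose is never selected. In a real system, stage 2 would train a fresh or continued dual encoder on such triples, and the retrieval step would use an approximate nearest-neighbour index rather than an exhaustive sort.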

Anthology ID: 2023.emnlp-industry.22
Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month: December
Year: 2023
Address: Singapore
Editors: Mingxuan Wang, Imed Zitouni
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 216–222
URL: https://aclanthology.org/2023.emnlp-industry.22/
DOI: 10.18653/v1/2023.emnlp-industry.22
Cite (ACL): Justin Chiu. 2023. Retrieval-Enhanced Dual Encoder Training for Product Matching. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 216–222, Singapore. Association for Computational Linguistics.
Cite (Informal): Retrieval-Enhanced Dual Encoder Training for Product Matching (Chiu, EMNLP 2023)
PDF: https://aclanthology.org/2023.emnlp-industry.22.pdf
Video: https://aclanthology.org/2023.emnlp-industry.22.mp4
