Towards Generalizeable Semantic Product Search by Text Similarity Pre-training on Search Click Logs

Zheng Liu, Wei Zhang, Yan Chen, Weiyi Sun, Tianchuan Du, Benjamin Schroeder


Abstract
Recently, semantic search has been successfully applied to e-commerce product search, and the learned semantic space for query and product encoding is expected to generalize well to unseen queries or products. Yet whether such generalization readily emerges has not been thoroughly studied in this domain. In this paper, we examine several general-domain and domain-specific pre-trained RoBERTa variants. We find that general-domain fine-tuning does not really help generalization, which aligns with findings in prior work, whereas proper domain-specific fine-tuning with clickstream data can lead to better model generalization, based on a bucketed analysis of manually annotated query-product relevance data.
Anthology ID:
2022.ecnlp-1.26
Volume:
Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Shervin Malmasi, Oleg Rokhlenko, Nicola Ueffing, Ido Guy, Eugene Agichtein, Surya Kallumadi
Venue:
ECNLP
Publisher:
Association for Computational Linguistics
Pages:
224–233
URL:
https://aclanthology.org/2022.ecnlp-1.26
DOI:
10.18653/v1/2022.ecnlp-1.26
Cite (ACL):
Zheng Liu, Wei Zhang, Yan Chen, Weiyi Sun, Tianchuan Du, and Benjamin Schroeder. 2022. Towards Generalizeable Semantic Product Search by Text Similarity Pre-training on Search Click Logs. In Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5), pages 224–233, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Towards Generalizeable Semantic Product Search by Text Similarity Pre-training on Search Click Logs (Liu et al., ECNLP 2022)
PDF:
https://aclanthology.org/2022.ecnlp-1.26.pdf
Video:
https://aclanthology.org/2022.ecnlp-1.26.mp4
Data
WANDS