Efficient and Interpretable Information Retrieval for Product Question Answering with Heterogeneous Data

Biplob Biswas, Rajiv Ramnath


Abstract
Expansion-enhanced sparse lexical representations improve information retrieval (IR) by reducing vocabulary mismatch during lexical matching. In this paper, we explore the potential of jointly learning a dense semantic representation and combining it with the lexical one to rank candidate information. We present a hybrid information retrieval mechanism that exploits both lexical and semantic matching while mitigating their respective shortcomings. Our architecture consists of dual hybrid encoders that independently encode queries and information elements. Through contrastive learning, each encoder jointly learns a dense semantic representation and a sparse lexical representation augmented by a learnable term expansion of the corresponding text. We demonstrate the efficacy of our model in single-stage ranking on a benchmark product question-answering dataset containing the heterogeneous information typically available on online product pages. Our evaluation shows that the hybrid approach outperforms independently trained retrievers by 10.95% (sparse) and 2.7% (dense) in MRR@5. Moreover, our model offers better interpretability and performs comparably to state-of-the-art cross-encoders while reducing latency by 30% and computational load (FLOPs) by approximately 38%.
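The hybrid scoring idea and the MRR@5 metric mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the dense and sparse scores are each a dot product and that the two are combined by simple addition; the abstract does not state the combination rule. All function names, vectors, and term weights below are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def dense_score(q: Sequence[float], d: Sequence[float]) -> float:
    """Dot product of two dense embeddings (assumed similarity function)."""
    return sum(a * b for a, b in zip(q, d))


def sparse_score(q: Dict[str, float], d: Dict[str, float]) -> float:
    """Dot product over the vocabulary terms shared by query and candidate."""
    return sum(w * d[t] for t, w in q.items() if t in d)


def hybrid_score(
    q_dense: Sequence[float],
    q_sparse: Dict[str, float],
    d_dense: Sequence[float],
    d_sparse: Dict[str, float],
) -> float:
    """Assumed combination: unweighted sum of the semantic and lexical scores."""
    return dense_score(q_dense, d_dense) + sparse_score(q_sparse, d_sparse)


def term_contributions(
    q: Dict[str, float], d: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Per-term share of the sparse score, largest first.

    Inspecting these terms is one way a sparse representation can make a
    ranking decision interpretable.
    """
    shared = [(t, w * d[t]) for t, w in q.items() if t in d]
    return sorted(shared, key=lambda item: item[1], reverse=True)


def mrr_at_k(ranked_relevance: List[List[bool]], k: int = 5) -> float:
    """Mean reciprocal rank of the first relevant item within the top k.

    Each inner list holds relevance flags for one query, in ranked order.
    A query with no relevant item in the top k contributes 0.
    """
    total = 0.0
    for flags in ranked_relevance:
        for rank, relevant in enumerate(flags[:k], start=1):
            if relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_relevance)


# Hypothetical toy data: a two-dimensional dense space and a few weighted terms.
q_dense, d_dense = [1.0, 0.0], [0.5, 0.5]
q_sparse = {"battery": 2.0, "life": 1.0}
d_sparse = {"battery": 1.5, "charge": 0.7}

score = hybrid_score(q_dense, q_sparse, d_dense, d_sparse)
explanation = term_contributions(q_sparse, d_sparse)
mrr = mrr_at_k([[False, True], [True], [False] * 6], k=5)
```

Because query and candidate are encoded independently, candidate representations can in principle be computed offline and only the query encoded at request time, which is the usual reason a dual-encoder design is cheaper than a cross-encoder.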
Anthology ID:
2024.ecnlp-1.3
Volume:
Proceedings of the Seventh Workshop on e-Commerce and NLP @ LREC-COLING 2024
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Shervin Malmasi, Besnik Fetahu, Nicola Ueffing, Oleg Rokhlenko, Eugene Agichtein, Ido Guy
Venues:
ECNLP | WS
Publisher:
ELRA and ICCL
Pages:
19–28
URL:
https://aclanthology.org/2024.ecnlp-1.3
Cite (ACL):
Biplob Biswas and Rajiv Ramnath. 2024. Efficient and Interpretable Information Retrieval for Product Question Answering with Heterogeneous Data. In Proceedings of the Seventh Workshop on e-Commerce and NLP @ LREC-COLING 2024, pages 19–28, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Efficient and Interpretable Information Retrieval for Product Question Answering with Heterogeneous Data (Biswas & Ramnath, ECNLP-WS 2024)
PDF:
https://aclanthology.org/2024.ecnlp-1.3.pdf