MICO: Selective Search with Mutual Information Co-training

Zhanyu Wang, Xiao Zhang, Hyokun Yun, Choon Hui Teo, Trishul Chilimbi


Abstract
In contrast to traditional exhaustive search, which scores every document for each query, selective search first clusters documents into groups so that each query is executed against only one or a few groups. Selective search is designed to reduce the latency and computation of modern large-scale search systems. In this study, we propose MICO, a Mutual Information CO-training framework for selective search with minimal supervision using search logs. After training, MICO not only clusters the documents but also routes unseen queries to the relevant clusters for efficient retrieval. In our empirical experiments, MICO significantly improves multiple metrics of selective search and outperforms a number of competitive baselines.
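The abstract's cluster-then-route pipeline can be illustrated with a toy sketch. This is not the MICO co-training method itself: here the clustering is plain k-means over assumed document vectors and the query router simply picks the nearest cluster centroid, standing in for the learned components the paper describes.

```python
import numpy as np


def cluster_documents(doc_vecs, k, iters=10, seed=0):
    """Toy k-means clustering of document vectors.

    A stand-in for MICO's learned document clustering: each document
    ends up assigned to one of k groups.
    """
    rng = np.random.default_rng(seed)
    centers = doc_vecs[rng.choice(len(doc_vecs), size=k, replace=False)]
    for _ in range(iters):
        # Assign each document to its nearest center.
        assign = np.argmin(
            ((doc_vecs[:, None] - centers[None]) ** 2).sum(-1), axis=1
        )
        # Move each center to the mean of its assigned documents.
        for j in range(k):
            members = doc_vecs[assign == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers, assign


def selective_search(query_vec, doc_vecs, centers, assign, top_clusters=1):
    """Route the query to its nearest cluster(s), then score only
    the documents inside them -- instead of scoring all documents."""
    dists = ((centers - query_vec) ** 2).sum(-1)
    chosen = np.argsort(dists)[:top_clusters]
    candidates = np.flatnonzero(np.isin(assign, chosen))
    scores = doc_vecs[candidates] @ query_vec
    return candidates[np.argsort(-scores)]
```

With `top_clusters=1`, only one group's documents are scored per query, which is the latency/computation saving selective search targets; raising `top_clusters` trades speed back for recall.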
Anthology ID:
2022.coling-1.102
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
1179–1192
URL:
https://aclanthology.org/2022.coling-1.102
Cite (ACL):
Zhanyu Wang, Xiao Zhang, Hyokun Yun, Choon Hui Teo, and Trishul Chilimbi. 2022. MICO: Selective Search with Mutual Information Co-training. In Proceedings of the 29th International Conference on Computational Linguistics, pages 1179–1192, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
MICO: Selective Search with Mutual Information Co-training (Wang et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.102.pdf
Code
 aws/selective-search-with-mutual-information-cotraining