Chen-Yu Hsu
2024
Unsupervised Multilingual Dense Retrieval via Generative Pseudo Labeling
Chao-Wei Huang | Chen-An Li | Tsu-Yuan Hsu | Chen-Yu Hsu | Yun-Nung Chen
Findings of the Association for Computational Linguistics: EACL 2024
2023
CONVERSER: Few-shot Conversational Dense Retrieval with Synthetic Data Generation
Chao-Wei Huang | Chen-Yu Hsu | Tsu-Yuan Hsu | Chen-An Li | Yun-Nung Chen
Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Conversational search provides a natural interface for information retrieval (IR). Recent approaches have demonstrated promising results in applying dense retrieval to conversational IR. However, training dense retrievers requires large amounts of in-domain paired data, which hinders the development of conversational dense retrievers, as abundant in-domain conversations are expensive to collect. In this paper, we propose CONVERSER, a framework for training conversational dense retrievers with at most 6 examples of in-domain dialogues. Specifically, we utilize the in-context learning capability of large language models to generate conversational queries given a passage in the retrieval corpus. Experimental results on the conversational retrieval benchmarks OR-QuAC and TREC CAsT-19 show that the proposed CONVERSER achieves performance comparable to fully supervised models, demonstrating the effectiveness of our framework for few-shot conversational dense retrieval. All source code and generated datasets are available: https://github.com/MiuLab/CONVERSER
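To illustrate the in-context learning step described in the abstract, the sketch below shows how a few-shot prompt pairing passages with conversational queries might be assembled before being sent to an LLM. The function name, prompt wording, and example dialogue are illustrative assumptions, not the authors' actual prompt format; the paper and repository linked above contain the real implementation.

```python
# Minimal sketch of few-shot prompt construction for generating a
# conversational query from a retrieval passage. Each in-context
# example pairs a passage with the query it should elicit; the target
# passage is appended for the LLM to complete. All strings here are
# hypothetical placeholders.

def build_fewshot_prompt(examples, target_passage):
    """Assemble an in-context learning prompt from (passage, query)
    example pairs, ending with the target passage so the model
    generates its conversational query as the completion."""
    parts = []
    for ex in examples:
        parts.append(
            f"Passage: {ex['passage']}\nConversational query: {ex['query']}\n"
        )
    # Leave the final query slot empty for the model to fill in.
    parts.append(f"Passage: {target_passage}\nConversational query:")
    return "\n".join(parts)

examples = [
    {
        "passage": "The Eiffel Tower was completed in 1889 for the World's Fair.",
        "query": "When was it finished?",
    },
]
prompt = build_fewshot_prompt(examples, "Mount Everest is 8,849 meters tall.")
print(prompt)
```

The generated query would then be paired with its source passage to form synthetic training data for the dense retriever, which is how the framework sidesteps the need for large in-domain conversational datasets.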