CO3: Low-resource Contrastive Co-training for Generative Conversational Query Rewrite

Yifei Yuan, Chen Shi, Wang Runze, Liyi Chen, Renjun Hu, Zengming Zhang, Feijun Jiang, Wai Lam


Abstract
Generative query rewrite reconstructs queries using the conversation history, but it relies heavily on gold rewrite pairs that are expensive to obtain. Recently, few-shot learning has been gaining popularity for this task, but these methods are sensitive to the inherent noise that comes with limited data size. Moreover, both approaches suffer performance degradation when there is a language style shift between training and testing cases. To this end, we study low-resource generative conversational query rewrite that is robust to both noise and language style shift. The core idea is to utilize massive unlabeled data to make further improvements via a contrastive co-training paradigm. Specifically, we co-train two dual models (namely Rewriter and Simplifier) such that each of them provides extra guidance through pseudo-labeling for enhancing the other in an iterative manner. We also leverage contrastive learning with data augmentation, which enables our model to pay more attention to the truly valuable information than to the noise. Extensive experiments demonstrate the superiority of our model under both few-shot and zero-shot scenarios. We also verify that our model generalizes better when encountering language style shift.
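The iterative pseudo-labeling exchange between the Rewriter and the Simplifier can be illustrated with a minimal sketch. This is not the authors' implementation: `SubstitutionModel` is a hypothetical word-substitution stand-in for a real sequence-to-sequence model, `co_train` and its parameters are illustrative names, the conversation history is omitted, and the contrastive objective with data augmentation is left out entirely. The sketch only shows the data flow, assuming the Rewriter maps a context-dependent query to a self-contained one and the Simplifier does the reverse.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (source text, target text)


class SubstitutionModel:
    """Hypothetical toy model: learns word-for-word substitutions from pairs."""

    def __init__(self) -> None:
        self.rules: Dict[str, str] = {}

    def fit(self, pairs: List[Pair]) -> None:
        for src, tgt in pairs:
            src_words, tgt_words = src.split(), tgt.split()
            if len(src_words) != len(tgt_words):
                continue  # the toy model only handles equal-length pairs
            for a, b in zip(src_words, tgt_words):
                if a != b:
                    self.rules.setdefault(a, b)  # the first rule seen wins

    def predict(self, src: str) -> str:
        return " ".join(self.rules.get(w, w) for w in src.split())


def co_train(
    gold: List[Pair],            # few-shot (short query, full query) pairs
    unlabeled_short: List[str],  # context-dependent queries without rewrites
    unlabeled_full: List[str],   # self-contained queries without short forms
    rounds: int = 2,
) -> Tuple[SubstitutionModel, SubstitutionModel]:
    rewriter, simplifier = SubstitutionModel(), SubstitutionModel()
    gold_reversed = [(full, short) for short, full in gold]
    rewriter_data, simplifier_data = list(gold), list(gold_reversed)

    for _ in range(rounds):
        rewriter.fit(rewriter_data)
        simplifier.fit(simplifier_data)
        # The Simplifier pseudo-labels full queries, giving the Rewriter
        # extra (short, full) training pairs.
        rewriter_data = list(gold) + [
            (simplifier.predict(q), q) for q in unlabeled_full
        ]
        # The Rewriter pseudo-labels short queries, giving the Simplifier
        # extra (full, short) training pairs.
        simplifier_data = gold_reversed + [
            (rewriter.predict(q), q) for q in unlabeled_short
        ]

    # Final update on the last round of pseudo-labeled data.
    rewriter.fit(rewriter_data)
    simplifier.fit(simplifier_data)
    return rewriter, simplifier
```

In this toy setting the pseudo-labeled pairs carry no information beyond the gold pair, so the sketch demonstrates only how each model's predictions become the other's training data; any benefit from unlabeled data would depend on models that generalize, which the toy stand-in does not.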
Anthology ID:
2024.lrec-main.301
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
3394–3406
URL:
https://aclanthology.org/2024.lrec-main.301
Cite (ACL):
Yifei Yuan, Chen Shi, Wang Runze, Liyi Chen, Renjun Hu, Zengming Zhang, Feijun Jiang, and Wai Lam. 2024. CO3: Low-resource Contrastive Co-training for Generative Conversational Query Rewrite. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 3394–3406, Torino, Italia. ELRA and ICCL.
Cite (Informal):
CO3: Low-resource Contrastive Co-training for Generative Conversational Query Rewrite (Yuan et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.301.pdf