Learning to Rank Question Answer Pairs with Bilateral Contrastive Data Augmentation

Yang Deng, Wenxuan Zhang, Wai Lam


Abstract
In this work, we propose a novel and easy-to-apply data augmentation strategy, namely Bilateral Generation (BiG), with a contrastive training objective for improving the ranking of question answer pairs using existing labeled data. Specifically, we synthesize pseudo-positive QA pairs, in contrast to the original negative QA pairs, with two pre-trained generation models, one for question generation and the other for answer generation, which are fine-tuned on the limited positive QA pairs from the original dataset. With the augmented dataset, we design a contrastive training objective for learning to rank question answer pairs. Experimental results on three benchmark datasets show that our method significantly improves the performance of ranking models by making full use of existing labeled data, and that it can be easily applied to different ranking models.
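The contrastive objective described in the abstract can be illustrated with a minimal sketch. The exact loss used in the paper is not given here, so the InfoNCE-style form below, which groups the original positive answer together with the generated pseudo-positives against the negatives for the same question, is an assumption; `temperature` and the function name are likewise illustrative.

```python
import math

def contrastive_rank_loss(pos_scores, neg_scores, temperature=0.1):
    """Illustrative InfoNCE-style contrastive loss over QA relevance scores.

    pos_scores: scores for the original positive answer plus any
        pseudo-positive answers produced by the bilateral generation
        step (this grouping is an assumption, not the paper's exact loss).
    neg_scores: scores for the original negative answers.
    """
    scale = lambda s: math.exp(s / temperature)
    pos = sum(scale(s) for s in pos_scores)
    total = pos + sum(scale(s) for s in neg_scores)
    # Loss is small when positives dominate the score distribution.
    return -math.log(pos / total)
```

Under this sketch, a ranker that scores the (pseudo-)positive answers above the negatives incurs a low loss, so the generated pairs act as extra positive evidence rather than separate training examples.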
Anthology ID:
2021.wnut-1.20
Volume:
Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021)
Month:
November
Year:
2021
Address:
Online
Venues:
EMNLP | WNUT
Publisher:
Association for Computational Linguistics
Pages:
175–181
URL:
https://aclanthology.org/2021.wnut-1.20
DOI:
10.18653/v1/2021.wnut-1.20
Cite (ACL):
Yang Deng, Wenxuan Zhang, and Wai Lam. 2021. Learning to Rank Question Answer Pairs with Bilateral Contrastive Data Augmentation. In Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021), pages 175–181, Online. Association for Computational Linguistics.
Cite (Informal):
Learning to Rank Question Answer Pairs with Bilateral Contrastive Data Augmentation (Deng et al., WNUT 2021)
PDF:
https://aclanthology.org/2021.wnut-1.20.pdf
Data
WikiQA