SumCSE: Summary as a transformation for Contrastive Learning

Raghuveer Thirukovalluru, Xiaolan Wang, Jun Chen, Shuyang Li, Jie Lei, Rong Jin, Bhuwan Dhingra


Abstract
Sentence embedding models are typically trained using contrastive learning (CL), either using human annotations directly or by repurposing other annotated datasets. In this work, we explore the recently introduced paradigm of generating CL data using generative language models (LMs). In CL for computer vision (CV), compositional transformations (series of operations applied to an image, e.g. cropping + color distortion) that modify the input to retain minimal information were shown to be very effective. We show that composing a ‘Summary’ transformation with diverse paraphrasing/contradicting transformations accomplishes the same goal and works very well in CL for sentence embeddings. Our final generated dataset (using Vicuna-13B) significantly outperforms the previous best unsupervised method (using ChatGPT) by 1.8 points, and SimCSE, a strong supervised baseline, by 0.3 points on the semantic textual similarity (STS) benchmark.
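The contrastive objective described above is typically an InfoNCE-style loss with in-batch negatives: each sentence embedding is pulled toward the embedding of its transformed version (e.g. the output of a summary transformation) and pushed away from the other examples in the batch. Below is a minimal, dependency-free sketch of that loss; the function names and the pure-Python vector math are illustrative assumptions, not the paper's implementation.

```python
import math


def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchors, positives, temperature=0.05):
    """InfoNCE loss with in-batch negatives (illustrative sketch).

    anchors[i] is the embedding of sentence i; positives[i] is the
    embedding of its transformed version (e.g. a generated summary).
    All other positives in the batch serve as negatives for anchor i.
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        # Temperature-scaled similarities of anchor i to every positive.
        logits = [cosine(anchors[i], positives[j]) / temperature
                  for j in range(n)]
        # Cross-entropy with the matching positive as the target class.
        log_denom = math.log(sum(math.exp(l) for l in logits))
        total += -(logits[i] - log_denom)
    return total / n
```

When each anchor matches its own positive closely, the loss is near zero; mismatched pairs drive it up, which is what pushes the encoder to keep transformation-invariant information.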
Anthology ID:
2024.findings-naacl.227
Volume:
Findings of the Association for Computational Linguistics: NAACL 2024
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3577–3588
URL:
https://aclanthology.org/2024.findings-naacl.227
DOI:
10.18653/v1/2024.findings-naacl.227
Cite (ACL):
Raghuveer Thirukovalluru, Xiaolan Wang, Jun Chen, Shuyang Li, Jie Lei, Rong Jin, and Bhuwan Dhingra. 2024. SumCSE: Summary as a transformation for Contrastive Learning. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 3577–3588, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
SumCSE: Summary as a transformation for Contrastive Learning (Thirukovalluru et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-naacl.227.pdf