@inproceedings{ou-xu-2024-skicse,
title = "{SKICSE}: Sentence Knowable Information Prompted by {LLM}s Improves Contrastive Sentence Embeddings",
author = "Ou, Fangwei and
Xu, Jinan",
editor = "Duh, Kevin and
Gomez, Helena and
Bethard, Steven",
booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)",
month = jun,
year = "2024",
address = "Mexico City, Mexico",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.naacl-short.13",
doi = "10.18653/v1/2024.naacl-short.13",
pages = "141--146",
abstract = "Contrastive learning, which utilizes positive pairs and in-batch negatives to optimize the loss objective, has been proven to be an effective method for learning sentence embeddings. However, we argue that the previous methods of constructing positive pairs only through dropout perturbation or entailment relation are limited. Since there is more sentence knowable information (SKI) to be mined, such as sentence external knowledge, semantic analysis, and grammatical description. In this work, we first hand-craft a simple and effective prompt template that is able to obtain the knowable information of input sentences from LLMs (e.g., LLaMA). Then we combine the original sentence and its knowable information to form a positive pair for contrastive learning. We evaluate our method on standard semantic textual similarity (STS) tasks. Experimental results show that our unsupervised and supervised models using $\text{BERT}_\text{base}$ achieve an average of 78.65{\%} and 82.45{\%} Spearman{'}s correlation respectively, a 2.40{\%} and 0.88{\%} improvement compared to SimCSE. Our model outperforms the previous state-of-the-art model PromptBERT in both unsupervised and supervised settings and specifically yields a new state-of-the-art performance in supervised setting.",
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="ou-xu-2024-skicse">
<titleInfo>
<title>SKICSE: Sentence Knowable Information Prompted by LLMs Improves Contrastive Sentence Embeddings</title>
</titleInfo>
<name type="personal">
<namePart type="given">Fangwei</namePart>
<namePart type="family">Ou</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jinan</namePart>
<namePart type="family">Xu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2024-06</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Kevin</namePart>
<namePart type="family">Duh</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Helena</namePart>
<namePart type="family">Gomez</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Steven</namePart>
<namePart type="family">Bethard</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Mexico City, Mexico</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>Contrastive learning, which utilizes positive pairs and in-batch negatives to optimize the loss objective, has been proven to be an effective method for learning sentence embeddings. However, we argue that the previous methods of constructing positive pairs only through dropout perturbation or entailment relation are limited, since there is more sentence knowable information (SKI) to be mined, such as sentence external knowledge, semantic analysis, and grammatical description. In this work, we first hand-craft a simple and effective prompt template that is able to obtain the knowable information of input sentences from LLMs (e.g., LLaMA). Then we combine the original sentence and its knowable information to form a positive pair for contrastive learning. We evaluate our method on standard semantic textual similarity (STS) tasks. Experimental results show that our unsupervised and supervised models using BERT-base achieve an average of 78.65% and 82.45% Spearman’s correlation respectively, a 2.40% and 0.88% improvement compared to SimCSE. Our model outperforms the previous state-of-the-art model PromptBERT in both unsupervised and supervised settings and specifically yields a new state-of-the-art performance in the supervised setting.</abstract>
<identifier type="citekey">ou-xu-2024-skicse</identifier>
<identifier type="doi">10.18653/v1/2024.naacl-short.13</identifier>
<location>
<url>https://aclanthology.org/2024.naacl-short.13</url>
</location>
<part>
<date>2024-06</date>
<extent unit="page">
<start>141</start>
<end>146</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T SKICSE: Sentence Knowable Information Prompted by LLMs Improves Contrastive Sentence Embeddings
%A Ou, Fangwei
%A Xu, Jinan
%Y Duh, Kevin
%Y Gomez, Helena
%Y Bethard, Steven
%S Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)
%D 2024
%8 June
%I Association for Computational Linguistics
%C Mexico City, Mexico
%F ou-xu-2024-skicse
%X Contrastive learning, which utilizes positive pairs and in-batch negatives to optimize the loss objective, has been proven to be an effective method for learning sentence embeddings. However, we argue that the previous methods of constructing positive pairs only through dropout perturbation or entailment relation are limited, since there is more sentence knowable information (SKI) to be mined, such as sentence external knowledge, semantic analysis, and grammatical description. In this work, we first hand-craft a simple and effective prompt template that is able to obtain the knowable information of input sentences from LLMs (e.g., LLaMA). Then we combine the original sentence and its knowable information to form a positive pair for contrastive learning. We evaluate our method on standard semantic textual similarity (STS) tasks. Experimental results show that our unsupervised and supervised models using BERT-base achieve an average of 78.65% and 82.45% Spearman’s correlation respectively, a 2.40% and 0.88% improvement compared to SimCSE. Our model outperforms the previous state-of-the-art model PromptBERT in both unsupervised and supervised settings and specifically yields a new state-of-the-art performance in the supervised setting.
%R 10.18653/v1/2024.naacl-short.13
%U https://aclanthology.org/2024.naacl-short.13
%U https://doi.org/10.18653/v1/2024.naacl-short.13
%P 141-146
Markdown (Informal)
[SKICSE: Sentence Knowable Information Prompted by LLMs Improves Contrastive Sentence Embeddings](https://aclanthology.org/2024.naacl-short.13) (Ou & Xu, NAACL 2024)
ACL
Fangwei Ou and Jinan Xu. 2024. SKICSE: Sentence Knowable Information Prompted by LLMs Improves Contrastive Sentence Embeddings. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers), pages 141–146, Mexico City, Mexico. Association for Computational Linguistics.
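
As an informal illustration of the method summarized in the abstract above (this is not the authors' released code), the sketch below shows a SimCSE-style contrastive objective in which each positive pair couples a sentence embedding with the embedding of its LLM-generated "knowable information", while the other pairs in the batch act as in-batch negatives. The prompt wording, function names, and toy data are assumptions made for illustration only.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(sent_emb, ski_emb, temperature=0.05):
    """SimCSE-style contrastive loss over (sentence, knowable-information) pairs.

    sent_emb, ski_emb: (batch, dim) embeddings of the original sentences and of
    the LLM-generated knowable-information texts. Row i of each tensor forms the
    positive pair; every other row in the batch serves as an in-batch negative.
    """
    # Cosine similarity between every sentence and every knowable-information text.
    sim = F.cosine_similarity(sent_emb.unsqueeze(1), ski_emb.unsqueeze(0), dim=-1)
    sim = sim / temperature
    # The diagonal holds the positives, so the target class for row i is index i.
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, labels)

# Hypothetical prompt template in the spirit of the paper (the exact wording is
# an assumption, not quoted from the paper): ask an LLM such as LLaMA what is
# knowable about the sentence, then embed sentence and response with one encoder.
PROMPT = ("Sentence: {sentence}\n"
          "Describe the knowable information (background knowledge, semantics, "
          "grammar) of this sentence:")

if __name__ == "__main__":
    # Toy check with random vectors standing in for BERT-base embeddings.
    torch.manual_seed(0)
    sent = torch.randn(8, 768)
    ski = sent + 0.1 * torch.randn(8, 768)  # pretend SKI embeddings lie near their sentences
    print(info_nce_loss(sent, ski).item())
```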