Dong-Kyu Chae


2024

Simple Temperature Cool-down in Contrastive Framework for Unsupervised Sentence Representation Learning
Yoo Hyun Jeong | Myeong Soo Han | Dong-Kyu Chae
Findings of the Association for Computational Linguistics: EACL 2024

In this paper, we propose a simple yet effective trick to improve sentence representations learned by unsupervised contrastive learning. Although contrastive learning has achieved strong performance in both visual representation learning (VRL) and sentence representation learning (SRL), we focus on the gap between the characteristics and training dynamics of VRL and SRL. We first examine the role of temperature in bridging this gap and find several temperature-dependent elements in SRL; i.e., a higher temperature causes overfitting of uniformity while improving alignment in the earlier phase of training. Based on this observation, we design a temperature cool-down technique that helps PLMs become more suitable for contrastive learning by preparing a uniform representation space. Our experimental results on widely used benchmarks demonstrate the effectiveness and extensibility of our method.
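As a rough illustration of the idea, the sketch below applies a temperature cool-down to a SimCSE-style InfoNCE loss: training starts at a higher temperature and is lowered toward the usual value during the early phase. The linear schedule, the function names, and the specific values (tau_start, tau_end, cooldown_steps) are illustrative assumptions, not the exact schedule proposed in the paper.

```python
# Minimal sketch: SimCSE-style InfoNCE loss whose temperature is "cooled down"
# (lowered) during early training. Schedule form and constants are assumed.
import torch
import torch.nn.functional as F


def cooled_temperature(step: int, tau_start: float = 0.1, tau_end: float = 0.05,
                       cooldown_steps: int = 1000) -> float:
    """Linearly interpolate from a higher temperature down to the usual value."""
    if step >= cooldown_steps:
        return tau_end
    frac = step / cooldown_steps
    return tau_start + frac * (tau_end - tau_start)


def simcse_loss(z1: torch.Tensor, z2: torch.Tensor, step: int) -> torch.Tensor:
    """InfoNCE over two dropout-augmented views z1, z2 of shape (batch, dim)."""
    tau = cooled_temperature(step)
    # Pairwise cosine similarities between the two views, scaled by temperature.
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / tau
    labels = torch.arange(z1.size(0), device=z1.device)  # positives on the diagonal
    return F.cross_entropy(sim, labels)
```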

Bootstrap Your Own PLM: Boosting Semantic Features of PLMs for Unsupervised Contrastive Learning
Yoo Hyun Jeong | Myeong Soo Han | Dong-Kyu Chae
Findings of the Association for Computational Linguistics: EACL 2024

This paper investigates the possibility of exploiting the original semantic features of PLMs (pre-trained language models) during contrastive learning in the context of SRL (sentence representation learning). In the context of feature modification, we identify a method called IFM (implicit feature modification), which reduces the tendency of contrastive models in VRL (visual representation learning) to rely on feature-suppressing shortcut solutions. We observe that IFM does not work well for SRL, which may be due to differences between the nature of VRL and SRL. We therefore propose BYOP, which boosts well-represented features, taking the opposite approach to IFM, under the assumption that SimCSE's dropout-noise-based augmentation may be too simple to modify high-level semantic features, and that the features learned by PLMs are semantically meaningful and should be boosted rather than suppressed. Extensive experiments lend credence to the design of BYOP, which takes the nature of SRL into account.
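As a rough illustration of the contrast between the two directions, the sketch below perturbs the two views of a positive pair either away from each other (an IFM-like suppression of shared features) or toward each other (boosting shared features). The perturbation form, the function perturb_views, and the strength eps are illustrative assumptions, not the exact formulation of either IFM or BYOP.

```python
# Conceptual sketch: suppressing vs. boosting the features shared by a
# positive pair, via a perturbation along the positive-pair direction.
# The perturbation form and eps are assumptions for illustration only.
import torch
import torch.nn.functional as F


def perturb_views(z1: torch.Tensor, z2: torch.Tensor, eps: float = 0.1,
                  boost: bool = True) -> tuple[torch.Tensor, torch.Tensor]:
    """Shift each view along (or against) the direction of its positive.

    boost=True  -> push views toward each other (amplify shared features),
    boost=False -> push views apart (IFM-like suppression of shared features).
    """
    direction = F.normalize(z2 - z1, dim=-1)
    sign = 1.0 if boost else -1.0
    return z1 + sign * eps * direction, z2 - sign * eps * direction
```

The perturbed views would then be fed to the same contrastive objective as the unperturbed ones; whether suppression or boosting helps is exactly the VRL-versus-SRL distinction the abstract describes.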