Space Decomposition for Sentence Embedding

Wuttikorn Ponwitayarat, Peerat Limkonchotiwat, Ekapol Chuangsuwanich, Sarana Nutanong


Abstract
Determining sentence pair similarity is crucial for various NLP tasks. Techniques that address this task are typically evaluated on a continuous semantic textual similarity (STS) scale from 0 to 5. However, based on a linguistic observation in the STS annotation guidelines, we found that a score in the range [4,5] indicates an upper-range sample, while all other scores indicate lower-range samples. This necessitates a new approach that treats the upper-range and lower-range classes separately. In this paper, we introduce a novel embedding space decomposition method called MixSP, which utilizes a Mixture of Specialized Projectors designed to distinguish and rank upper-range and lower-range samples accurately. The experimental results demonstrate that MixSP significantly decreases the representation overlap between upper-range and lower-range classes while outperforming competitors on STS and zero-shot benchmarks.
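The upper-range/lower-range distinction described in the abstract can be illustrated with a minimal sketch. It only partitions STS-style sentence pairs by gold score, assuming the [4,5] boundary stated above; it is not the MixSP method itself. The names `UPPER_RANGE_MIN` and `split_by_range` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

# A sentence pair with its gold similarity score on the 0-5 STS scale.
ScoredPair = Tuple[str, str, float]

# Assumption: the abstract states that scores in [4, 5] are upper-range,
# so 4.0 is used here as the inclusive lower bound of that class.
UPPER_RANGE_MIN = 4.0


def split_by_range(pairs: List[ScoredPair]) -> Tuple[List[ScoredPair], List[ScoredPair]]:
    """Partition scored pairs into (upper_range, lower_range) lists."""
    upper: List[ScoredPair] = []
    lower: List[ScoredPair] = []
    for sent_a, sent_b, score in pairs:
        # Reject scores outside the 0-5 scale rather than silently binning them.
        if not 0.0 <= score <= 5.0:
            raise ValueError(f"score {score} is outside the 0-5 STS scale")
        target = upper if score >= UPPER_RANGE_MIN else lower
        target.append((sent_a, sent_b, score))
    return upper, lower
```

For example, calling `split_by_range` on pairs scored 5.0, 4.0, 3.9, and 0.0 places the first two in the upper-range list and the last two in the lower-range list.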
Anthology ID:
2024.findings-acl.668
Volume:
Findings of the Association for Computational Linguistics ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
11227–11239
URL:
https://aclanthology.org/2024.findings-acl.668
Cite (ACL):
Wuttikorn Ponwitayarat, Peerat Limkonchotiwat, Ekapol Chuangsuwanich, and Sarana Nutanong. 2024. Space Decomposition for Sentence Embedding. In Findings of the Association for Computational Linguistics ACL 2024, pages 11227–11239, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
Space Decomposition for Sentence Embedding (Ponwitayarat et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.668.pdf