Toward Interpretable Semantic Textual Similarity via Optimal Transport-based Contrastive Sentence Learning

Seonghyeon Lee, Dongha Lee, Seongbo Jang, Hwanjo Yu


Abstract
Recently, fine-tuning a pretrained language model to capture the similarity between sentence embeddings has shown state-of-the-art performance on the semantic textual similarity (STS) task. However, the absence of an interpretation method for sentence similarity makes it difficult to explain the model output. In this work, we explicitly describe the sentence distance as the weighted sum of contextualized token distances on the basis of a transportation problem, and then present the optimal transport-based distance measure, named RCMD; it identifies and leverages semantically-aligned token pairs. Finally, we propose CLRCMD, a contrastive learning framework that optimizes the RCMD of sentence pairs, which enhances the quality of sentence similarity and its interpretation. Extensive experiments demonstrate that our learning framework outperforms other baselines on both STS and interpretable-STS benchmarks, indicating that it computes effective sentence similarity and also provides interpretation consistent with human judgement.
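
To make the transportation-problem view in the abstract concrete, the following minimal Python/NumPy sketch computes a relaxed optimal-transport distance between two sentences represented as contextualized token embeddings, in the spirit of RCMD. The function name, the uniform token weights, and the averaging of the two greedy directions are illustrative assumptions, not the paper's exact formulation.

import numpy as np

def relaxed_token_movers_distance(x, y):
    # x: [m, d], y: [n, d] contextualized token embeddings of two sentences.
    # Cosine distance between every pair of tokens.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    cost = 1.0 - x @ y.T  # [m, n] token-pair distances
    # Uniform token masses (illustrative; the paper may weight tokens differently).
    wx = np.full(x.shape[0], 1.0 / x.shape[0])
    wy = np.full(y.shape[0], 1.0 / y.shape[0])
    # Relaxed transportation: each token sends all of its mass to its cheapest
    # match in the other sentence, and the two directional costs are averaged
    # into a symmetric sentence distance.
    d_xy = np.sum(wx * cost.min(axis=1))
    d_yx = np.sum(wy * cost.min(axis=0))
    return 0.5 * (d_xy + d_yx)

# Example: random embeddings stand in for contextualized token states.
a = np.random.randn(5, 768)
b = np.random.randn(7, 768)
print(relaxed_token_movers_distance(a, b))

In a contrastive setup such as CLRCMD, a similarity derived from this kind of distance would be optimized over positive and negative sentence pairs; the per-token-pair costs then also serve as an interpretation of which tokens drive the sentence-level similarity.
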
Anthology ID:
2022.acl-long.412
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
5969–5979
URL:
https://aclanthology.org/2022.acl-long.412
DOI:
10.18653/v1/2022.acl-long.412
Cite (ACL):
Seonghyeon Lee, Dongha Lee, Seongbo Jang, and Hwanjo Yu. 2022. Toward Interpretable Semantic Textual Similarity via Optimal Transport-based Contrastive Sentence Learning. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5969–5979, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Toward Interpretable Semantic Textual Similarity via Optimal Transport-based Contrastive Sentence Learning (Lee et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.412.pdf
Software:
 2022.acl-long.412.software.zip
Video:
 https://aclanthology.org/2022.acl-long.412.mp4
Code:
 sh0416/clrcmd
Data:
 MultiNLI, SNLI