Going Beyond Sentence Embeddings: A Token-Level Matching Algorithm for Calculating Semantic Textual Similarity

Hongwei Wang, Dong Yu


Abstract
Semantic Textual Similarity (STS) measures the degree to which the underlying semantics of paired sentences are equivalent. State-of-the-art methods for the STS task use language models to encode sentences into embeddings. However, these embeddings are limited in their ability to represent semantics because they mix all of the semantic information together in a fixed-length vector, from which the individual semantics are difficult to recover and which offers little explainability. This paper presents a token-level matching inference algorithm that can be applied on top of any language model to improve its performance on the STS task. Our method calculates pairwise token-level similarities and token matching scores, and then aggregates them with pretrained token weights to produce a sentence similarity score. Experimental results on seven STS datasets show that our method improves the performance of almost all language models, with gains of up to 12.7% in Spearman’s correlation. We also demonstrate that our method is highly explainable and computationally efficient.
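
The following is a minimal sketch of the general recipe the abstract describes (pairwise token similarity, token matching, weighted aggregation), not the authors' exact algorithm. It assumes contextual token embeddings are already available from some language model, uses a simplified greedy best-match step, and takes user-supplied per-token weights as a stand-in for the pretrained token weights described in the paper.

import numpy as np

def token_matching_similarity(emb_a, emb_b, w_a, w_b):
    """Illustrative token-level matching score for a sentence pair.

    emb_a: (n, d) token embeddings for sentence A
    emb_b: (m, d) token embeddings for sentence B
    w_a, w_b: per-token importance weights (length n and m)
    """
    # Normalize so dot products are cosine similarities.
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = a @ b.T                      # (n, m) pairwise token similarity

    # Match each token to its most similar token in the other sentence.
    best_a = sim.max(axis=1)           # best match for each token in A
    best_b = sim.max(axis=0)           # best match for each token in B

    # Aggregate matching scores with the token weights, then symmetrize.
    score_a = np.average(best_a, weights=w_a)
    score_b = np.average(best_b, weights=w_b)
    return 2 * score_a * score_b / (score_a + score_b)

# Usage with random arrays standing in for a language model's token embeddings:
rng = np.random.default_rng(0)
s = token_matching_similarity(rng.normal(size=(5, 768)),
                              rng.normal(size=(7, 768)),
                              w_a=np.ones(5), w_b=np.ones(7))
print(round(float(s), 4))

Because the final score is a weighted sum of per-token match scores, each token's contribution can be inspected directly, which is the source of the explainability claimed in the abstract.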
Anthology ID:
2023.acl-short.49
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
563–570
URL:
https://aclanthology.org/2023.acl-short.49
DOI:
10.18653/v1/2023.acl-short.49
Bibkey:
Cite (ACL):
Hongwei Wang and Dong Yu. 2023. Going Beyond Sentence Embeddings: A Token-Level Matching Algorithm for Calculating Semantic Textual Similarity. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 563–570, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Going Beyond Sentence Embeddings: A Token-Level Matching Algorithm for Calculating Semantic Textual Similarity (Wang & Yu, ACL 2023)
PDF:
https://aclanthology.org/2023.acl-short.49.pdf
Video:
https://aclanthology.org/2023.acl-short.49.mp4