Measuring Similarity of Opinion-bearing Sentences

Wenyi Tay, Xiuzhen Zhang, Stephen Wan, Sarvnaz Karimi


Abstract
Comparing two opinion-bearing sentences is key to many NLP applications of online reviews. We argue that, while general-purpose text similarity metrics have been applied for this purpose, there has been limited exploration of their applicability to opinion texts. We address this gap in the literature by studying: (1) how humans judge the similarity of pairs of opinion-bearing sentences; and (2) the degree to which existing text similarity metrics, particularly embedding-based ones, correspond to human judgments. We crowdsourced annotations for opinion sentence pairs, and our main findings are: (1) annotators tend to agree on whether opinion sentences are similar or different; and (2) embedding-based metrics capture human judgments of “opinion similarity” but not “opinion difference”. Based on our analysis, we identify areas where the current metrics should be improved. We further propose to learn a metric for opinion similarity by fine-tuning the Sentence-BERT sentence-embedding network on review text, with weak supervision from review ratings. Experiments show that our learned metric outperforms existing text similarity metrics and, in particular, shows significantly higher correlations with human annotations for differing opinions.
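As a rough illustration of the evaluation idea in the abstract, the sketch below scores sentence pairs by the cosine similarity of their embeddings and then checks how well those scores track human similarity ratings. It is a minimal sketch under stated assumptions, not the authors' implementation: the embedding vectors and human ratings are made-up toy values, the helper names are hypothetical, and the use of Spearman rank correlation is an assumption, since the abstract does not say which correlation measure the paper reports.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: one (embedding_a, embedding_b) tuple per sentence
# pair. Real embeddings would come from a sentence encoder.
embedding_pairs = [
    ([0.9, 0.1, 0.0], [0.8, 0.2, 0.1]),
    ([0.1, 0.9, 0.2], [0.7, 0.1, 0.3]),
    ([0.3, 0.3, 0.9], [0.2, 0.4, 0.8]),
    ([0.5, 0.5, 0.0], [0.0, 0.1, 0.9]),
]
# Hypothetical human similarity ratings for the same four pairs.
human_scores = [4.5, 1.5, 4.0, 1.0]

metric_scores = [cosine_similarity(a, b) for a, b in embedding_pairs]
print("metric vs. human Spearman:", spearman(metric_scores, human_scores))
```

A rank correlation is used here because human ratings and cosine scores live on different scales; only the ordering of the pairs is compared.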
Anthology ID:
2021.newsum-1.9
Volume:
Proceedings of the Third Workshop on New Frontiers in Summarization
Month:
November
Year:
2021
Address:
Online and in Dominican Republic
Editors:
Giuseppe Carenini, Jackie Chi Kit Cheung, Yue Dong, Fei Liu, Lu Wang
Venue:
NewSum
Publisher:
Association for Computational Linguistics
Pages:
74–84
URL:
https://aclanthology.org/2021.newsum-1.9
DOI:
10.18653/v1/2021.newsum-1.9
Cite (ACL):
Wenyi Tay, Xiuzhen Zhang, Stephen Wan, and Sarvnaz Karimi. 2021. Measuring Similarity of Opinion-bearing Sentences. In Proceedings of the Third Workshop on New Frontiers in Summarization, pages 74–84, Online and in Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Measuring Similarity of Opinion-bearing Sentences (Tay et al., NewSum 2021)
PDF:
https://aclanthology.org/2021.newsum-1.9.pdf
Video:
https://aclanthology.org/2021.newsum-1.9.mp4
Code:
wenyi-tay/sos