SimsterQ: A Similarity based Clustering Approach to Opinion Question Answering

Aishwarya Ashok, Ganapathy Natarajan, Ramez Elmasri, Laurel Smith-Stvan


Abstract
In recent years, online shopping has grown, and with it the number of online reviews. Customers looking for specific aspects of a product cannot sift through this volume of data, yet some of these aspects can be extracted from the product reviews. In this paper we introduce SimsterQ, a clustering-based system for answering questions that makes use of word vectors. Clustering was performed using cosine similarity scores between sentence vectors of reviews and questions. Two variants (Sim and Median), each with and without stopwords, were evaluated against traditional methods that use term frequency. We also used an n-gram approach to study the effect of noise. Answers were picked from the reviews in the Amazon Reviews dataset. Evaluation was performed both at the individual sentence level, using the top sentence from Okapi BM25 as the gold standard, and at the whole-answer level, using review snippets as the gold standard. At the sentence level, our system performed slightly better than a more complicated deep learning method. Our system returned answers similar to the review snippets from the Amazon QA Dataset, as measured by cosine similarity. We also analyzed the quality of the clusters generated by our system.
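The core idea in the abstract, scoring review sentences against a question by the cosine similarity of their sentence vectors, can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' implementation: the toy word vectors, the averaging of word vectors into a sentence vector, the fixed `threshold`, and the function names are all hypothetical, and the abstract does not specify how the Sim and Median variants choose their cut-off.

```python
import math

# Toy 3-dimensional word vectors, invented purely for illustration.
# A real system would load pretrained word vectors instead.
TOY_VECTORS = {
    "battery": [0.9, 0.1, 0.0],
    "life": [0.8, 0.2, 0.1],
    "lasts": [0.7, 0.3, 0.0],
    "long": [0.6, 0.2, 0.1],
    "screen": [0.1, 0.9, 0.1],
    "bright": [0.0, 0.8, 0.2],
    "shipping": [0.1, 0.1, 0.9],
    "fast": [0.2, 0.0, 0.8],
}


def sentence_vector(sentence):
    """Average the vectors of known words (an assumed pooling choice).

    Returns None when no word in the sentence has a vector.
    """
    vecs = [TOY_VECTORS[w] for w in sentence.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_sentences(question, review_sentences, threshold=0.9):
    """Keep review sentences whose similarity to the question meets the threshold.

    The fixed threshold is a hypothetical stand-in for whatever rule
    groups sentences with a question in the actual system.
    Returns (score, sentence) pairs, highest score first.
    """
    q_vec = sentence_vector(question)
    if q_vec is None:
        return []
    selected = []
    for sentence in review_sentences:
        s_vec = sentence_vector(sentence)
        if s_vec is None:
            continue
        score = cosine(q_vec, s_vec)
        if score >= threshold:
            selected.append((score, sentence))
    selected.sort(key=lambda pair: pair[0], reverse=True)
    return selected


if __name__ == "__main__":
    reviews = ["battery lasts long", "screen bright", "shipping fast"]
    for score, sentence in select_sentences("battery life", reviews):
        print(f"{score:.3f}  {sentence}")
```

With these toy vectors, only the battery-related sentence clears the threshold for the question "battery life"; the other two sentences point in different directions of the vector space and score far lower.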
Anthology ID:
2020.ecnlp-1.11
Volume:
Proceedings of the 3rd Workshop on e-Commerce and NLP
Month:
July
Year:
2020
Address:
Seattle, WA, USA
Venue:
ECNLP
Publisher:
Association for Computational Linguistics
Pages:
69–76
URL:
https://aclanthology.org/2020.ecnlp-1.11
DOI:
10.18653/v1/2020.ecnlp-1.11
Cite (ACL):
Aishwarya Ashok, Ganapathy Natarajan, Ramez Elmasri, and Laurel Smith-Stvan. 2020. SimsterQ: A Similarity based Clustering Approach to Opinion Question Answering. In Proceedings of the 3rd Workshop on e-Commerce and NLP, pages 69–76, Seattle, WA, USA. Association for Computational Linguistics.
Cite (Informal):
SimsterQ: A Similarity based Clustering Approach to Opinion Question Answering (Ashok et al., ECNLP 2020)
PDF:
https://aclanthology.org/2020.ecnlp-1.11.pdf