Sentiment Analysis of Yelp Review Dataset: A Comparative Study of Machine Learning Methods

Krishna Thakar; Mohamed Abu Sheha; Emmanuel Thompson

Abstract

Sentiment analysis determines whether the sentiment expressed in a text is positive, negative, or neutral. For online reviews such as those on Yelp, it helps businesses assess customer satisfaction and identify areas for improvement. Given the large volume of user-generated content, restaurants often struggle to extract actionable insights from feedback, which makes sentiment analysis an efficient tool for categorizing reviews and highlighting customer concerns. This study focuses on sentiment analysis of Yelp reviews. The main research question is: How can Natural Language Processing (NLP) combined with statistical machine learning methods be applied to classify sentiment in Yelp reviews and provide actionable insights for improving customer satisfaction, service quality, and business performance? The study used 21,000 Yelp reviews, preprocessed using NLP techniques: tokenization, stop-word removal, and vectorization. Classification performance was compared across traditional machine learning methods (Logistic Regression, Support Vector Machine (SVM), Naïve Bayes, Random Forest), deep learning methods (CNN, LSTM, BiLSTM, GRU, RNN), and a transformer-based model (RoBERTa). Results showed that RoBERTa outperformed the other candidate methods. These findings highlight the potential of advanced NLP techniques to offer businesses practical ways to address customer complaints, enhance service quality, and improve overall business performance.
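As a rough illustration of the preprocessing steps named in the abstract (tokenization, stop-word removal, vectorization) feeding one of the traditional classifiers it lists (Naïve Bayes), the following is a minimal standard-library sketch. It is not the authors' implementation: the stop-word list, the bag-of-words representation, the add-one smoothing, and the four toy reviews are all assumptions made for illustration, and the paper's actual features and hyperparameters are not given here.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; the abstract does not say which list was used.
STOP_WORDS = {"the", "a", "an", "was", "is", "and", "it", "this", "to", "of", "i"}


def tokenize(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOP_WORDS]


def vectorize(text):
    """Bag-of-words vectorization: map each remaining token to its count."""
    return Counter(tokenize(text))


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(vectorize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        features = vectorize(text)
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for word, count in features.items():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                p = (self.word_counts[label][word] + 1) / (total_words + len(self.vocab))
                score += count * math.log(p)  # log likelihood
            scores[label] = score
        return max(scores, key=scores.get)


# Toy, made-up reviews for illustration only (not taken from the Yelp dataset).
train_texts = [
    "The food was delicious and the service was friendly",
    "Great atmosphere and amazing pasta",
    "Terrible service and the food was cold",
    "Rude staff, awful experience",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("The pasta was amazing")
print(prediction)
```

A study at the scale described (21,000 reviews) would more plausibly rely on an established library for vectorization and modeling; the sketch only makes the data flow from raw text to a predicted label concrete.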

Anthology ID: 2026.acl-srw.38
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 428–437
URL: https://aclanthology.org/2026.acl-srw.38/
Cite (ACL): Krishna Thakar, Mohamed Abu Sheha, and Emmanuel Thompson. 2026. Sentiment Analysis of Yelp Review Dataset: A Comparative Study of Machine Learning Methods. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 428–437, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Sentiment Analysis of Yelp Review Dataset: A Comparative Study of Machine Learning Methods (Thakar et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-srw.38.pdf