Data Fusion for Better Fake Reviews Detection

Alimuddin Melleng, Anna Jurek-Loughrey, Deepak P


Abstract
Online reviews have become critical in informing purchasing decisions, making the detection of fake reviews a crucial challenge to tackle. Many different Machine Learning based solutions have been proposed, using various data representations such as n-grams or document embeddings. In this paper, we first explore the effectiveness of different data representations, including emotion, document embedding, n-grams, and noun phrases in embedding format, for fake reviews detection. We evaluate these representations with various state-of-the-art deep learning models, such as BiLSTM, LSTM, GRU, CNN, and MLP. Following this, we propose to incorporate different data representations and classification models using early and late data fusion techniques in order to improve the prediction performance. The experiments are conducted on four datasets: Hotel, Restaurant, Amazon, and Yelp. The results demonstrate that a combination of different data representations significantly outperforms any of the single data representations.
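The abstract does not spell out how the early and late fusion steps are implemented, so the following is only a minimal sketch of the general idea, under stated assumptions: early fusion is shown as concatenating the feature vectors from several representations before classification, and late fusion as averaging the fake-review probabilities produced by separate models. The function names, the averaging rule, the 0.5 threshold, and all toy numbers are hypothetical and are not taken from the paper.

```python
def early_fusion(*representations):
    """Concatenate the per-review feature vectors of several representations.

    Each argument is a list with one feature vector per review
    (assumption: all representations cover the same reviews in the same order).
    """
    fused = []
    for vectors in zip(*representations):
        row = []
        for vec in vectors:
            row.extend(vec)
        fused.append(row)
    return fused


def late_fusion(*model_probs, threshold=0.5):
    """Average per-model fake-review probabilities, then threshold.

    Averaging is one common late-fusion rule; it is an assumption here,
    not necessarily the rule used by the authors.
    """
    averaged = [sum(ps) / len(ps) for ps in zip(*model_probs)]
    labels = [1 if p >= threshold else 0 for p in averaged]
    return averaged, labels


# Hypothetical toy data: three reviews, two representations.
emotion = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
embedding = [[0.3, 0.1, 0.7], [0.2, 0.9, 0.4], [0.6, 0.6, 0.1]]
fused = early_fusion(emotion, embedding)  # one 5-dimensional vector per review

# Hypothetical probabilities of "fake" from two separately trained models.
probs_model_a = [0.9, 0.2, 0.6]
probs_model_b = [0.7, 0.4, 0.3]
averaged, labels = late_fusion(probs_model_a, probs_model_b)
```

In this sketch, the fused vectors would be passed to a single classifier, whereas the late-fusion path combines decisions from classifiers that were each trained on one representation.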
Anthology ID:
2023.ranlp-1.79
Volume:
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing
Month:
September
Year:
2023
Address:
Varna, Bulgaria
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
730–738
URL:
https://aclanthology.org/2023.ranlp-1.79
Cite (ACL):
Alimuddin Melleng, Anna Jurek-Loughrey, and Deepak P. 2023. Data Fusion for Better Fake Reviews Detection. In Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, pages 730–738, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Data Fusion for Better Fake Reviews Detection (Melleng et al., RANLP 2023)
PDF:
https://aclanthology.org/2023.ranlp-1.79.pdf