2023
Multi-task Ensemble Learning for Fake Reviews Detection and Helpfulness Prediction: A Novel Approach
Alimuddin Melleng | Anna Jurek-Loughrey | Deepak P
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing
Research on fake reviews detection and review helpfulness prediction is prevalent, yet most studies tend to focus solely on either fake reviews detection or review helpfulness prediction, considering them separate research tasks. In contrast to this prevailing pattern, we address both challenges concurrently by employing a multi-task learning approach. We posit that undertaking these tasks simultaneously can enhance the performance of each task through shared information among features. We utilize pre-trained RoBERTa embeddings with a document-level data representation. This is coupled with an array of deep learning and neural network models, including Bi-LSTM, LSTM, GRU, and CNN. Additionally, we employ ensemble learning techniques to integrate these models, with the objective of enhancing overall prediction accuracy and mitigating the risk of overfitting. The findings of this study offer valuable insights to the fields of natural language processing and machine learning and present a novel perspective on leveraging multi-task learning for the twin challenges of fake reviews detection and review helpfulness prediction.
Data Fusion for Better Fake Reviews Detection
Alimuddin Melleng | Anna Jurek-Loughrey | Deepak P
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing
Online reviews have become critical in informing purchasing decisions, making the detection of fake reviews a crucial challenge to tackle. Many different machine learning based solutions have been proposed, using various data representations such as n-grams or document embeddings. In this paper, we first explore the effectiveness of different data representations, including emotion, document embedding, n-grams, and noun phrases in embedding format, for fake reviews detection. We evaluate these representations with various state-of-the-art deep learning models, such as BILSTM, LSTM, GRU, CNN, and MLP. Following this, we propose to incorporate different data representations and classification models using early and late data fusion techniques in order to improve the prediction performance. The experiments are conducted on four datasets: Hotel, Restaurant, Amazon, and Yelp. The results demonstrate that combinations of different data representations significantly outperform any of the single data representations.
Multiple Evidence Combination for Fact-Checking of Health-Related Information
Pritam Deka | Anna Jurek-Loughrey | Deepak P
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
Fact-checking of health-related claims has become necessary in this digital age, where any information posted online is easily available to everyone. The most effective way to verify such claims is by using evidence obtained from reliable sources of medical knowledge, such as PubMed. Recent advances in the field of NLP have helped automate such fact-checking tasks. In this work, we propose a domain-specific BERT-based model using a transfer learning approach for the task of predicting the veracity of claim-evidence pairs for the verification of health-related facts. We also improvise on a method to combine multiple pieces of evidence retrieved for a single claim, taking conflicting evidence into consideration as well. We also show how our model can be exploited when labelled data is available and how back-translation can be used to augment data when data is scarce.
2021
Ranking Online Reviews Based on Their Helpfulness: An Unsupervised Approach
Alimuddin Melleng | Anna Jurek-Loughrey | Deepak P
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Online reviews are an essential aspect of online shopping for both customers and retailers. However, many reviews found on the Internet lack quality, informativeness or helpfulness. In many cases, they lead customers towards positive or negative opinions without providing any concrete details (e.g., "very poor product, I would not recommend it"). In this work, we propose a novel unsupervised method for quantifying helpfulness that leverages the availability of a corpus of reviews. In particular, our method exploits three characteristics of the reviews, viz., relevance, emotional intensity and specificity, towards quantifying helpfulness. We perform three rankings (one for each characteristic above), which are then combined to obtain a final helpfulness ranking. To evaluate our method empirically, we use reviews from four product categories of Amazon reviews. The experimental evaluation demonstrates the effectiveness of our method in comparison to a recent state-of-the-art baseline.
2020
Does History Matter? Using Narrative Context to Predict the Trajectory of Sentence Sentiment
Liam Watson | Anna Jurek-Loughrey | Barry Devereux | Brian Murphy
Proceedings of the Second Workshop on Linguistic and Neurocognitive Resources
While there is a rich literature on the tracking of sentiment and emotion in texts, modelling the emotional trajectory of longer narratives, such as literary texts, poses new challenges. Previous work in the area of sentiment analysis has focused on using information from within a sentence to predict a valence value for that sentence. We propose to explore the influence of previous sentences on the sentiment of a given sentence. In particular, we investigate whether information present in a history of previous sentences can be used to predict a valence value for the following sentence. We explored both linear and non-linear models applied with a range of different feature combinations. We also looked at different context history sizes to determine what range of previous sentence context was the most informative for our models. We establish a linear relationship between sentence context history and the valence value of the current sentence and demonstrate that sentences in closer proximity to the target sentence are more informative. We show that the inclusion of semantic word embeddings further enriches our model predictions.
2019
Sentiment and Emotion Based Representations for Fake Reviews Detection
Alimuddin Melleng | Anna Jurek-Loughrey | Deepak P
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
Fake reviews are increasingly prevalent across the Internet. They can be unethical as well as harmful. They can affect businesses and mislead individual customers. As opinions on the Web are increasingly used, the detection of fake reviews has become more and more critical. In this study, we explore the effectiveness of sentiment- and emotion-based representations for the task of building machine learning models for fake review detection. We perform empirical studies over three real-world datasets and demonstrate that improved data representation can be achieved by combining sentiment and emotion extraction methods, as well as by performing sentiment and emotion analysis on a part-by-part basis by segmenting the reviews.