2023
Unlocking Emotions in Text: A Fusion of Word Embeddings and Lexical Knowledge for Emotion Classification
Anjali Bhardwaj | Nesar Ahmad Wasi | Muhammad Abulaish
Proceedings of the 20th International Conference on Natural Language Processing (ICON)
This paper introduces an improved method for emotion classification that integrates emotion lexicons with pre-trained word embeddings. The proposed method uses semantically similar features to bridge the semantic gap between words and emotions. The approach is compared against three baselines for predicting Ekman's emotions at the document level on the GoEmotions dataset. Its effectiveness is assessed using standard evaluation metrics, which show at least a 5% performance gain over the baselines.
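A minimal sketch of the lexicon-embedding fusion idea: concatenate a document's mean word embedding with emotion-lexicon counts before classification. The embeddings, lexicon, and labels below are toy stand-ins (in practice, e.g., GloVe vectors and NRC EmoLex), not the paper's actual resources:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy stand-ins for pre-trained embeddings and an emotion lexicon.
EMBEDDINGS = {"happy": np.array([0.9, 0.1]), "sad": np.array([0.1, 0.9]),
              "joyful": np.array([0.8, 0.2]), "gloomy": np.array([0.2, 0.8])}
LEXICON = {"happy": "joy", "joyful": "joy", "sad": "sadness", "gloomy": "sadness"}
EMOTIONS = ["joy", "sadness"]  # subset of Ekman's six, for illustration

def featurize(doc):
    """Concatenate the mean word embedding with lexicon emotion counts."""
    words = [w for w in doc.lower().split() if w in EMBEDDINGS]
    emb = np.mean([EMBEDDINGS[w] for w in words], axis=0) if words else np.zeros(2)
    lex = np.array([sum(LEXICON.get(w) == e for w in words) for e in EMOTIONS])
    return np.concatenate([emb, lex])

docs = ["happy joyful day", "sad gloomy night"]
X = np.stack([featurize(d) for d in docs])
y = [0, 1]  # 0 = joy, 1 = sadness
clf = LogisticRegression().fit(X, y)
print(clf.predict([featurize("joyful morning")]))  # -> [0]
```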
An Emotion-Enriched and Psycholinguistics Features-Based Approach for Rumor Detection on Online Social Media
Asimul Haque | Muhammad Abulaish
Proceedings of the 11th International Workshop on Natural Language Processing for Social Media
2022
A Graph-Based Approach Leveraging Posts and Reactions for Detecting Rumors on Online Social Media
Asimul Haque | Muhammad Abulaish
Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation
DeepADA: An Attention-Based Deep Learning Framework for Augmenting Imbalanced Textual Datasets
Amit Sah | Muhammad Abulaish
Proceedings of the 19th International Conference on Natural Language Processing (ICON)
In this paper, we present an attention-based deep learning framework, DeepADA, which uses data augmentation to address the class imbalance problem in textual datasets. The proposed framework performs the following steps: (i) using MPNet-based embeddings to extract keywords from minority-class documents, (ii) using a CNN-BiLSTM architecture with parallel attention to learn the important contextual words associated with the minority-class documents' keywords, enriched with word-level characteristics derived from their statistical and semantic features, (iii) using MPNet to replace the key contextual terms in the oversampled documents that match a keyword with the contextual term that best fits the context, and finally (iv) oversampling the minority-class dataset to produce a balanced dataset. Using a 2-layer stacked BiLSTM classifier, we assess the efficacy of the proposed framework on the original and oversampled versions of three Amazon reviews datasets. We compare the proposed data augmentation approach with two state-of-the-art text data augmentation methods. The experimental results reveal that our method produces a more useful oversampled dataset that helps the classifier perform better than the other two state-of-the-art methods. Moreover, the oversampled datasets outperform their original counterparts by a wide margin.
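A minimal sketch of step (i), MPNet-based keyword extraction, done KeyBERT-style by ranking candidate words against the whole-document embedding; the example document and the ranking heuristic are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Public MPNet checkpoint; the review text is a made-up minority-class example.
model = SentenceTransformer("all-mpnet-base-v2")
doc = "battery drains fast and the charger overheats within minutes"
candidates = list(set(doc.split()))

doc_vec = model.encode([doc])         # shape (1, 768)
cand_vecs = model.encode(candidates)  # shape (n, 768)

# Rank candidate words by cosine similarity to the document embedding.
sims = (cand_vecs @ doc_vec.T).ravel() / (
    np.linalg.norm(cand_vecs, axis=1) * np.linalg.norm(doc_vec) + 1e-9)
top_k = [candidates[i] for i in np.argsort(-sims)[:3]]
print(top_k)  # the 3 words most representative of the document
```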
Compact Residual Learning with Frequency-Based Non-Square Kernels for Small Footprint Keyword Spotting
Muhammad Abulaish | Rahul Gulia
Proceedings of the 19th International Conference on Natural Language Processing (ICON)
Enabling voice assistants on small embedded devices requires a keyword spotter with a small model size and adequate accuracy, and achieving a reasonable trade-off between a small footprint and high accuracy is difficult. Recent studies have demonstrated that convolutional neural networks are also effective in the audio domain. In this paper, taking into account the nature of spectrograms, we propose a compact ResNet architecture that uses frequency-based non-square kernels to extract the maximum number of timbral features for keyword spotting. The proposed architecture is approximately three-and-a-half times smaller than a comparable architecture with conventional square kernels. On the Google Speech Commands dataset v1, it outperforms both Google's convolutional neural networks and the equivalent ResNet architecture with square kernels. By using non-square kernels on spectrogram data, we achieve a significant increase in accuracy with relatively few parameters compared to the conventional square kernels that are the default choice for most problems.
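A minimal sketch of a residual block with a non-square kernel spanning the frequency axis (PyTorch; the (9, 1) kernel size and tensor shapes are illustrative assumptions, not the paper's exact configuration):

```python
import torch
import torch.nn as nn

class FreqResBlock(nn.Module):
    """Residual block whose kernel spans frequency only.

    Assumes spectrogram inputs shaped (batch, channels, freq, time).
    """
    def __init__(self, channels, kernel=(9, 1)):
        super().__init__()
        pad = (kernel[0] // 2, kernel[1] // 2)  # keep spatial dims unchanged
        self.conv1 = nn.Conv2d(channels, channels, kernel, padding=pad, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel, padding=pad, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)  # identity shortcut

x = torch.randn(4, 16, 40, 101)  # batch of 40-mel, 101-frame spectrograms
print(FreqResBlock(16)(x).shape)  # torch.Size([4, 16, 40, 101])
```

The parameter saving is direct: a (9, 1) kernel has 9 weights per channel pair, versus 81 for a 9×9 square kernel.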
2019
An LSTM-Based Deep Learning Approach for Detecting Self-Deprecating Sarcasm in Textual Data
Ashraf Kamal | Muhammad Abulaish
Proceedings of the 16th International Conference on Natural Language Processing
Self-deprecating sarcasm is a special category of sarcasm that is nowadays popular and useful for many real-life applications, such as brand endorsement, product campaigns, digital marketing, and advertisement. The self-deprecating style of campaigning and marketing is mainly adopted to boost brand endorsement and product sales. In this paper, we propose an LSTM-based deep learning approach for detecting self-deprecating sarcasm in textual data. To the best of our knowledge, there is no prior work on self-deprecating sarcasm detection using deep learning techniques. Starting with a filtering step to identify self-referential tweets, the proposed approach applies an LSTM-based deep learning model to detect self-deprecating sarcasm. The proposed approach is evaluated over three Twitter datasets and performs significantly better in terms of precision, recall, and F-score.
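A minimal sketch of the two-stage idea, a self-referential filter followed by an LSTM classifier; the pronoun list, vocabulary size, and dimensions are illustrative assumptions, not the paper's settings:

```python
import re
import torch
import torch.nn as nn

# Stage 1 (filtering): keep only self-referential tweets. The pronoun
# pattern is an illustrative heuristic, not the paper's exact rule set.
SELF_RE = re.compile(r"\b(i|me|my|myself|mine)\b", re.IGNORECASE)
tweets = ["I am such a genius, locked myself out again", "Great weather today"]
self_refs = [t for t in tweets if SELF_RE.search(t)]

# Stage 2 (detection): an LSTM classifier over token ids; toy hyperparameters.
class SarcasmLSTM(nn.Module):
    def __init__(self, vocab=1000, dim=32, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.lstm = nn.LSTM(dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, ids):
        _, (h, _) = self.lstm(self.emb(ids))        # final hidden state
        return torch.sigmoid(self.out(h[-1]))       # P(self-deprecating sarcasm)

ids = torch.randint(0, 1000, (len(self_refs), 12))  # dummy token ids
print(SarcasmLSTM()(ids))
```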
DRCoVe: An Augmented Word Representation Approach using Distributional and Relational Context
Md. Aslam Parwez | Muhammad Abulaish | Mohd Fazil
Proceedings of the 16th International Conference on Natural Language Processing
Word representation using the distributional information of words from a sizeable corpus is considered efficacious in many natural language processing and text mining applications. However, the distributional representation of a word is unable to capture distant relational knowledge that represents relational semantics. In this paper, we propose a novel word representation approach using distributional and relational contexts, DRCoVe, which augments the distributional representation of a word with relational semantics extracted as syntactic and semantic associations among entities in the underlying corpus. Unlike existing approaches that use external knowledge bases to represent relational semantics for enhanced word representation, DRCoVe uses typed dependencies (aka syntactic dependencies) to extract relational knowledge from the underlying corpus. The proposed approach is applied to a biomedical text corpus to learn word representations and compared with GloVe, one of the most popular word embedding approaches. The evaluation results on various benchmark datasets for word similarity and word categorization tasks demonstrate the effectiveness of DRCoVe over GloVe.
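A minimal sketch of extracting relational context from typed dependencies, here with spaCy; the example sentence and the choice of spaCy are illustrative, not the authors' pipeline:

```python
import spacy

# Each word is paired with (typed relation, head) tuples that can augment
# its distributional context. Requires the small English model
# (python -m spacy download en_core_web_sm).
nlp = spacy.load("en_core_web_sm")
doc = nlp("Aspirin inhibits platelet aggregation in patients.")

for token in doc:
    if token.dep_ != "ROOT":
        # (word, typed dependency, head) triples as relational context features
        print((token.text, token.dep_, token.head.text))
```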
Word representation using the distributional information of words from a sizeable corpus is considered efficacious in many natural language processing and text mining applications. However, distributional representation of a word is unable to capture distant relational knowledge, representing the relational semantics. In this paper, we propose a novel word representation approach using distributional and relational contexts, DRCoVe, which augments the distributional representation of a word using the relational semantics extracted as syntactic and semantic association among entities from the underlying corpus. Unlike existing approaches that use external knowledge bases representing the relational semantics for enhanced word representation, DRCoVe uses typed dependencies (aka syntactic dependencies) to extract relational knowledge from the underlying corpus. The proposed approach is applied over a biomedical text corpus to learn word representation and compared with GloVe, which is one of the most popular word embedding approaches. The evaluation results on various benchmark datasets for word similarity and word categorization tasks demonstrate the effectiveness of DRCoVe over the GloVe.