2024
Cost-Efficient Subjective Task Annotation and Modeling through Few-Shot Annotator Adaptation
Preni Golazizian | Alireza Salkhordeh Ziabari | Ali Omrani | Morteza Dehghani
Findings of the Association for Computational Linguistics: EMNLP 2024
In subjective NLP tasks, where a single ground truth does not exist, the inclusion of diverse annotators becomes crucial as their unique perspectives significantly influence the annotations. In realistic scenarios, the annotation budget often becomes the main determinant of the number of perspectives (i.e., annotators) included in the data and subsequent modeling. We introduce a novel framework for annotation collection and modeling in subjective tasks that aims to minimize the annotation budget while maximizing the predictive performance for each annotator. Our framework has a two-stage design: first, we rely on a small set of annotators to build a multitask model, and second, we augment the model for a new perspective by strategically annotating a few samples per annotator. To test our framework at scale, we introduce and release a unique dataset, Moral Foundations Subjective Corpus, of 2000 Reddit posts annotated by 24 annotators for moral sentiment. We demonstrate that our framework surpasses the previous SOTA in capturing the annotators’ individual perspectives with as little as 25% of the original annotation budget on two datasets. Furthermore, our framework results in more equitable models, reducing the performance disparity among annotators.
Towards a Unified Framework for Adaptable Problematic Content Detection via Continual Learning
Ali Omrani | Alireza Salkhordeh Ziabari | Preni Golazizian | Jeffrey Sorensen | Morteza Dehghani
Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH 2024)
Detecting problematic content, such as hate speech, is a multifaceted and ever-changing task, influenced by social dynamics, user populations, diversity of sources, and evolving language. There have been significant efforts, both in academia and in industry, to develop annotated resources that capture various aspects of problematic content. Due to researchers’ diverse objectives, these annotations are often inconsistent and, hence, reports of progress on the detection of problematic content are fragmented. This pattern is expected to persist unless we pool these resources, taking into account the dynamic nature of this issue. In this paper, we propose integrating the available resources, leveraging their dynamic nature to break this pattern, and introduce a continual learning framework and benchmark for problematic content detection. Our benchmark, comprising 84 related tasks, creates a novel measure of progress: prioritizing the adaptability of classifiers to evolving tasks over excelling in specific tasks. To ensure continuous relevance, our benchmark is designed for seamless integration of new tasks. Our results demonstrate that continual learning methods outperform static approaches by up to 17% and 4% AUC in capturing the evolving content and adapting to novel forms of problematic content.
Reinforced Multiple Instance Selection for Speaker Attribute Prediction
Alireza Salkhordeh Ziabari | Ali Omrani | Parsa Hejabi | Preni Golazizian | Brendan Kennedy | Payam Piray | Morteza Dehghani
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Language usage is related to speaker age, gender, moral concerns, political ideology, and other attributes. Current state-of-the-art methods for predicting these attributes take a speaker’s utterances as input and provide a prediction per speaker attribute. Most of these approaches struggle to handle a large number of utterances per speaker. This difficulty is primarily due to the computational constraints of the models. Additionally, only a subset of speaker utterances may be relevant to specific attributes. In this paper, we formulate speaker attribute prediction as a Multiple Instance Learning (MIL) problem and propose RL-MIL, a novel approach based on Reinforcement Learning (RL) that effectively addresses both of these challenges. Our experiments demonstrate that our RL-based methodology consistently outperforms previous approaches across a range of related tasks: predicting speakers’ psychographics and demographics from social media posts, and political ideologies from transcribed speeches. We create synthetic datasets and investigate the behavior of RL-MIL systematically. Our results show the success of RL-MIL in improving speaker attribute prediction by learning to select relevant speaker utterances.
2023
Social-Group-Agnostic Bias Mitigation via the Stereotype Content Model
Ali Omrani | Alireza Salkhordeh Ziabari | Charles Yu | Preni Golazizian | Brendan Kennedy | Mohammad Atari | Heng Ji | Morteza Dehghani
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Existing bias mitigation methods require social-group-specific word pairs (e.g., “man” – “woman”) for each social attribute (e.g., gender), restricting the bias mitigation to only one specified social attribute. Further, this constraint renders such methods impractical and costly for mitigating bias in understudied and/or unmarked social groups. We propose that the Stereotype Content Model (SCM), a theoretical framework developed in social psychology for understanding the content of stereotyping, can help debiasing efforts to become social-group-agnostic by capturing the underlying connection between bias and stereotypes. SCM proposes that the content of stereotypes maps to two psychological dimensions: warmth and competence. Using only pairs of terms for these two dimensions (e.g., warmth: “genuine” – “fake”; competence: “smart” – “stupid”), we perform debiasing with established methods on both pre-trained word embeddings and large language models. We demonstrate that our social-group-agnostic, SCM-based debiasing technique performs comparably to group-specific debiasing on multiple bias benchmarks, but has theoretical and practical advantages over existing approaches.
2020
Irony Detection in Persian Language: A Transfer Learning Approach Using Emoji Prediction
Preni Golazizian | Behnam Sabeti | Seyed Arad Ashrafi Asli | Zahra Majdabadi | Omid Momenzadeh | Reza Fahmi
Proceedings of the Twelfth Language Resources and Evaluation Conference
Irony is a linguistic device used to convey an idea while articulating its opposite. Many text analytics algorithms used for emotion extraction or sentiment analysis produce invalid results due to the use of irony. Persian speakers use this device more often due to the language’s nature and some cultural reasons. This phenomenon also appears on social media platforms such as Twitter, where users express their opinions using ironic or sarcastic posts. In the current research, which is the first attempt at irony detection in the Persian language, emoji prediction is used to build a pretrained model. The model is fine-tuned on a set of hand-labeled tweets with irony tags. A bidirectional LSTM (BiLSTM) network, improved by an attention mechanism, is employed as the basis of our model. Additionally, a Persian corpus for irony detection containing 4339 manually labeled tweets is introduced. Experiments show that the proposed approach outperforms the adapted state-of-the-art method tested on the Persian dataset, with an accuracy of 83.1%, and offers a strong baseline for further research in the Persian language.
Optimizing Annotation Effort Using Active Learning Strategies: A Sentiment Analysis Case Study in Persian
Seyed Arad Ashrafi Asli | Behnam Sabeti | Zahra Majdabadi | Preni Golazizian | Reza Fahmi | Omid Momenzadeh
Proceedings of the Twelfth Language Resources and Evaluation Conference
Deep learning models are the current state-of-the-art methodologies for many real-world problems. However, they need a substantial amount of labeled data to be trained appropriately. Acquiring labeled data can be challenging in some particular domains or less-resourced languages. There are some practical solutions to these issues, such as Active Learning and Transfer Learning. The idea of Active Learning is simple: let the model choose the samples for annotation instead of labeling the whole dataset. This method leads to a more efficient annotation process. Active Learning models can achieve the baseline performance (the accuracy of the model trained on the whole dataset) with a considerably smaller amount of labeled data. Several Active Learning approaches are tested in this work, and their compatibility with Persian is examined using a brand-new sentiment analysis dataset that is also introduced in this work. MirasOpinion, which to our knowledge is the largest Persian sentiment analysis dataset, is crawled from a Persian e-commerce website and annotated using a crowd-sourcing policy. LDA sampling, an efficient Active Learning strategy based on Topic Modeling, is proposed in this research. Active Learning strategies have shown promising results in the Persian language, and LDA sampling showed competitive performance compared to other approaches.
Twitter Trend Extraction: A Graph-based Approach for Tweet and Hashtag Ranking, Utilizing No-Hashtag Tweets
Zahra Majdabadi | Behnam Sabeti | Preni Golazizian | Seyed Arad Ashrafi Asli | Omid Momenzadeh | Reza Fahmi
Proceedings of the Twelfth Language Resources and Evaluation Conference
Twitter has become a major platform for users to express their opinions on any topic and engage in debates. User debates and interactions usually lead to a large volume of content on a specific topic, which is called a Trend. Twitter trend extraction aims at finding these relevant groups of content that are generated in a short period. The most straightforward approach to this problem is to use hashtags; however, this approach ignores tweets without hashtags. To overcome this issue and extract trends using all tweets, we propose a graph-based approach where graph nodes represent tweets as well as words and hashtags. More specifically, we propose a modified version of the RankClus algorithm to extract trends from the constructed tweet graph. The proposed approach is also capable of ranking tweets, words, and hashtags in each trend with respect to their importance and relevance to the topic. The proposed algorithm is used to extract trends from several Twitter datasets, where it produced consistent and coherent results.