Hillel Steinmetz


2024

pdf bib
Debiasing Multi-Entity Aspect-Based Sentiment Analysis with Norm-Based Data Augmentation
Scott Friedman | Joan Zheng | Hillel Steinmetz
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Bias in NLP models may arise from using pre-trained transformer models trained on biased corpora, or by training or fine-tuning directly on corpora with systemic biases. Recent research has explored strategies for reduce measurable biases in NLP predictions while maintaining prediction accuracy on held-out test sets, e.g., by modifying word embedding geometry after training, using purpose-built neural modules for training, or automatically augmenting training data with examples designed to reduce bias. This paper focuses on a debiasing strategy for aspect-based sentiment analysis (ABSA) by augmenting the training data using norm-based language templates derived from previous language resources. We show that the baseline model predicts lower sentiment toward some topics and individuals than others and has relatively high prediction bias (measured by standard deviation), even when the context is held constant. Our results show that our norm-based data augmentation reduces topical bias to less than half while maintaining prediction quality (measured by RMSE), by augmenting the training data by only 1.8%.