Debiasing Multi-Entity Aspect-Based Sentiment Analysis with Norm-Based Data Augmentation

Scott Friedman, Joan Zheng, Hillel Steinmetz


Abstract
Bias in NLP models may arise from using pre-trained transformer models trained on biased corpora, or from training or fine-tuning directly on corpora with systemic biases. Recent research has explored strategies for reducing measurable biases in NLP predictions while maintaining prediction accuracy on held-out test sets, e.g., by modifying word embedding geometry after training, using purpose-built neural modules for training, or automatically augmenting training data with examples designed to reduce bias. This paper focuses on a debiasing strategy for aspect-based sentiment analysis (ABSA) that augments the training data using norm-based language templates derived from previous language resources. We show that the baseline model predicts lower sentiment toward some topics and individuals than others and has relatively high prediction bias (measured by standard deviation), even when the context is held constant. Our results show that our norm-based data augmentation reduces topical bias to less than half of the baseline level while maintaining prediction quality (measured by RMSE), with the training data augmented by only 1.8%.
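The abstract names two measurements: prediction bias as a standard deviation with the context held constant, and prediction quality as RMSE. The sketch below shows one plausible way to compute both. It is an illustration under stated assumptions, not the authors' evaluation code: the abstract does not give the exact formulas, so the function names, the input layout (one list of scores per entity, aligned by context), the use of population standard deviation, and the averaging over contexts are all hypothetical choices.

```python
import math
import statistics


def prediction_bias(scores_by_entity):
    """Average spread of predicted sentiment across entities.

    ASSUMPTION: scores_by_entity maps each entity (topic or individual)
    to a list of predicted sentiment scores, where index i is the same
    context (template) for every entity. For each context we take the
    population standard deviation across entities, then average those
    values over contexts. The paper may define its metric differently.
    """
    entities = list(scores_by_entity)
    n_contexts = len(scores_by_entity[entities[0]])
    per_context = []
    for i in range(n_contexts):
        # Scores for context i, one per entity; the context is held constant.
        column = [scores_by_entity[e][i] for e in entities]
        per_context.append(statistics.pstdev(column))
    return statistics.fmean(per_context)


def rmse(predicted, gold):
    """Root mean squared error between predicted and gold sentiment."""
    squared_errors = [(p - g) ** 2 for p, g in zip(predicted, gold)]
    return math.sqrt(statistics.fmean(squared_errors))
```

Under this reading, a model that assigns every entity the same score in every fixed context has a bias of zero, and a debiasing method is useful if it lowers `prediction_bias` without raising `rmse` on the held-out test set.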
Anthology ID:
2024.lrec-main.398
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
4456–4461
URL:
https://aclanthology.org/2024.lrec-main.398
Cite (ACL):
Scott Friedman, Joan Zheng, and Hillel Steinmetz. 2024. Debiasing Multi-Entity Aspect-Based Sentiment Analysis with Norm-Based Data Augmentation. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 4456–4461, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Debiasing Multi-Entity Aspect-Based Sentiment Analysis with Norm-Based Data Augmentation (Friedman et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.398.pdf