@inproceedings{trust-etal-2022-bayes,
title = "{B}ayes at {F}ig{L}ang 2022 Euphemism Detection shared task: Cost-Sensitive {B}ayesian Fine-tuning and {V}enn-Abers Predictors for Robust Training under Class Skewed Distributions",
author = "Trust, Paul and
Provia, Kadusabe and
Omala, Kizito",
editor = "Ghosh, Debanjan and
Beigman Klebanov, Beata and
Muresan, Smaranda and
Feldman, Anna and
Poria, Soujanya and
Chakrabarty, Tuhin",
booktitle = "Proceedings of the 3rd Workshop on Figurative Language Processing (FLP)",
month = dec,
year = "2022",
address = "Abu Dhabi, United Arab Emirates (Hybrid)",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.flp-1.13",
doi = "10.18653/v1/2022.flp-1.13",
pages = "94--99",
abstract = "Transformers have achieved a state of the art performance across most natural language processing tasks. However the performance of these models degrade when being trained on skewed class distributions (class imbalance) because training tends to be biased towards head classes with most of the data points . Classical methods that have been proposed to handle this problem (re-sampling and re-weighting) often suffer from unstable performance, poor applicability and poor calibration. In this paper, we propose to use Bayesian methods and Venn-Abers predictors for well calibrated and robust training against class imbalance. Our proposed approach improves f1-score of the baseline RoBERTa (A Robustly Optimized Bidirectional Embedding from Transformers Pretraining Approach) model by about 6 points (79.0{\%} against 72.6{\%}) when training with class imbalanced data.",
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="trust-etal-2022-bayes">
<titleInfo>
<title>Bayes at FigLang 2022 Euphemism Detection shared task: Cost-Sensitive Bayesian Fine-tuning and Venn-Abers Predictors for Robust Training under Class Skewed Distributions</title>
</titleInfo>
<name type="personal">
<namePart type="given">Paul</namePart>
<namePart type="family">Trust</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Kadusabe</namePart>
<namePart type="family">Provia</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Kizito</namePart>
<namePart type="family">Omala</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2022-12</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 3rd Workshop on Figurative Language Processing (FLP)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Debanjan</namePart>
<namePart type="family">Ghosh</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Beata</namePart>
<namePart type="family">Beigman Klebanov</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Smaranda</namePart>
<namePart type="family">Muresan</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Anna</namePart>
<namePart type="family">Feldman</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Soujanya</namePart>
<namePart type="family">Poria</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Tuhin</namePart>
<namePart type="family">Chakrabarty</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Abu Dhabi, United Arab Emirates (Hybrid)</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>Transformers have achieved state-of-the-art performance across most natural language processing tasks. However, the performance of these models degrades when they are trained on skewed class distributions (class imbalance), because training tends to be biased towards the head classes that hold most of the data points. Classical methods proposed to handle this problem (re-sampling and re-weighting) often suffer from unstable performance, poor applicability, and poor calibration. In this paper, we propose to use Bayesian methods and Venn-Abers predictors for well-calibrated and robust training under class imbalance. Our proposed approach improves the F1 score of the baseline RoBERTa (Robustly Optimized BERT Pretraining Approach) model by about 6 points (79.0% against 72.6%) when training on class-imbalanced data.</abstract>
<identifier type="citekey">trust-etal-2022-bayes</identifier>
<identifier type="doi">10.18653/v1/2022.flp-1.13</identifier>
<location>
<url>https://aclanthology.org/2022.flp-1.13</url>
</location>
<part>
<date>2022-12</date>
<extent unit="page">
<start>94</start>
<end>99</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Bayes at FigLang 2022 Euphemism Detection shared task: Cost-Sensitive Bayesian Fine-tuning and Venn-Abers Predictors for Robust Training under Class Skewed Distributions
%A Trust, Paul
%A Provia, Kadusabe
%A Omala, Kizito
%Y Ghosh, Debanjan
%Y Beigman Klebanov, Beata
%Y Muresan, Smaranda
%Y Feldman, Anna
%Y Poria, Soujanya
%Y Chakrabarty, Tuhin
%S Proceedings of the 3rd Workshop on Figurative Language Processing (FLP)
%D 2022
%8 December
%I Association for Computational Linguistics
%C Abu Dhabi, United Arab Emirates (Hybrid)
%F trust-etal-2022-bayes
%X Transformers have achieved state-of-the-art performance across most natural language processing tasks. However, the performance of these models degrades when they are trained on skewed class distributions (class imbalance), because training tends to be biased towards the head classes that hold most of the data points. Classical methods proposed to handle this problem (re-sampling and re-weighting) often suffer from unstable performance, poor applicability, and poor calibration. In this paper, we propose to use Bayesian methods and Venn-Abers predictors for well-calibrated and robust training under class imbalance. Our proposed approach improves the F1 score of the baseline RoBERTa (Robustly Optimized BERT Pretraining Approach) model by about 6 points (79.0% against 72.6%) when training on class-imbalanced data.
%R 10.18653/v1/2022.flp-1.13
%U https://aclanthology.org/2022.flp-1.13
%U https://doi.org/10.18653/v1/2022.flp-1.13
%P 94-99
Markdown (Informal)
[Bayes at FigLang 2022 Euphemism Detection shared task: Cost-Sensitive Bayesian Fine-tuning and Venn-Abers Predictors for Robust Training under Class Skewed Distributions](https://aclanthology.org/2022.flp-1.13) (Trust et al., Fig-Lang 2022)
ACL
Paul Trust, Kadusabe Provia, and Kizito Omala. 2022. [Bayes at FigLang 2022 Euphemism Detection shared task: Cost-Sensitive Bayesian Fine-tuning and Venn-Abers Predictors for Robust Training under Class Skewed Distributions](https://aclanthology.org/2022.flp-1.13). In *Proceedings of the 3rd Workshop on Figurative Language Processing (FLP)*, pages 94–99, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
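
For readers landing on this record, here is a minimal sketch of the inductive Venn-Abers calibration step named in the abstract. It assumes scikit-learn's `IsotonicRegression` and a held-out calibration set of raw classifier scores; the function below is illustrative, not the authors' released code.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def venn_abers_interval(cal_scores, cal_labels, test_score):
    """Inductive Venn-Abers prediction for one binary test example.

    Fits isotonic regression on the calibration scores twice: once with
    the test point provisionally labeled 0 and once labeled 1. The pair
    (p0, p1) brackets a well-calibrated probability of the positive class.
    """
    probs = []
    for assumed_label in (0, 1):
        scores = np.append(cal_scores, test_score)
        labels = np.append(cal_labels, assumed_label)
        iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        iso.fit(scores, labels)
        probs.append(float(iso.predict([test_score])[0]))
    p0, p1 = probs
    # Common single-number merge of the interval (Vovk and Petej, 2014).
    return p0, p1, p1 / (1.0 - p0 + p1)

# Example: scores from any classifier on a small calibration split.
cal_scores = np.array([0.1, 0.3, 0.4, 0.6, 0.8, 0.9])
cal_labels = np.array([0, 0, 1, 0, 1, 1])
print(venn_abers_interval(cal_scores, cal_labels, test_score=0.7))
```

The width of the (p0, p1) interval signals how reliable the calibrated estimate is for that test point, which is what makes the method attractive under the skewed class distributions the paper targets.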