Fairness-Aware Online Positive-Unlabeled Learning

Hoin Jung; Xiaoqian Wang

doi:10.18653/v1/2024.emnlp-industry.14


Abstract

Machine learning applications for text classification are increasingly used in domains such as toxicity and misinformation detection in online settings. However, obtaining precisely labeled data for training remains challenging, particularly because not all problematic instances are reported. Positive-Unlabeled (PU) learning, which uses only labeled positive and unlabeled samples, offers a solution for these scenarios. A significant concern in PU learning, especially in online settings, is fairness: specific groups may be disproportionately classified as problematic. Despite its importance, this issue has not been explicitly addressed in research. This paper aims to bridge this gap by investigating the fairness of PU learning in both offline and online settings. We propose a novel approach to achieve more equitable results by extending PU learning methods to online learning for both linear and non-linear classifiers and analyzing the impact of the online setting on fairness. Our approach incorporates a convex fairness constraint during training, applicable to both offline and online PU learning. Our solution is theoretically robust, and experimental results demonstrate its efficacy in improving fairness in PU learning in text classification.
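The idea of combining a PU risk with a convex fairness term can be illustrated with a small sketch. This is not the authors' formulation, which the abstract does not spell out. It assumes a linear scorer, the standard unbiased PU risk with logistic loss (which reduces to a convex expression), and a squared gap in mean scores between two groups as a stand-in convex fairness penalty. The names `pu_fair_objective`, `online_step`, `prior`, `lam`, and `lr` are hypothetical, and the class prior is assumed to be known.

```python
import numpy as np


def pu_fair_objective(w, X_pos, X_unl, groups_unl, prior, lam):
    """Return (value, gradient) of a convex PU risk plus a fairness penalty.

    Assumptions (illustrative, not taken from the paper):
      - linear scores s = X @ w
      - logistic loss, for which the unbiased PU risk simplifies to
        -prior * mean(s_pos) + mean(log(1 + exp(s_unl)))
      - fairness penalty = lam * (mean score of group 0 - mean score of group 1)^2,
        measured on the unlabeled samples
    """
    s_pos = X_pos @ w
    s_unl = X_unl @ w

    # Convex PU risk: a linear term plus a mean of softplus terms.
    risk = -prior * s_pos.mean() + np.logaddexp(0.0, s_unl).mean()

    # Squared gap between group-wise mean scores (square of a linear function of w).
    mask0 = groups_unl == 0
    mask1 = groups_unl == 1
    gap = s_unl[mask0].mean() - s_unl[mask1].mean()
    value = risk + lam * gap ** 2

    # Gradient of each term with respect to w.
    sigma = 1.0 / (1.0 + np.exp(-s_unl))
    grad_risk = -prior * X_pos.mean(axis=0) + (sigma[:, None] * X_unl).mean(axis=0)
    grad_gap = X_unl[mask0].mean(axis=0) - X_unl[mask1].mean(axis=0)
    grad = grad_risk + 2.0 * lam * gap * grad_gap
    return value, grad


def online_step(w, batch, prior, lam, lr):
    """One gradient step on a newly arrived batch (X_pos, X_unl, groups_unl)."""
    X_pos, X_unl, groups_unl = batch
    _, grad = pu_fair_objective(w, X_pos, X_unl, groups_unl, prior, lam)
    return w - lr * grad
```

In an online setting, `online_step` would be called once per incoming batch. The sketch assumes every batch contains at least one labeled positive and at least one unlabeled sample from each group; a real system would need to handle batches where that does not hold.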

Anthology ID: 2024.emnlp-industry.14
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month: November
Year: 2024
Address: Miami, Florida, US
Editors: Franck Dernoncourt, Daniel Preoţiuc-Pietro, Anastasia Shimorina
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 170–185
URL: https://aclanthology.org/2024.emnlp-industry.14/
DOI: 10.18653/v1/2024.emnlp-industry.14
Cite (ACL): Hoin Jung and Xiaoqian Wang. 2024. Fairness-Aware Online Positive-Unlabeled Learning. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 170–185, Miami, Florida, US. Association for Computational Linguistics.
Cite (Informal): Fairness-Aware Online Positive-Unlabeled Learning (Jung & Wang, EMNLP 2024)
PDF: https://aclanthology.org/2024.emnlp-industry.14.pdf
Software: 2024.emnlp-industry.14.software.zip
Poster: 2024.emnlp-industry.14.poster.pdf
Presentation: 2024.emnlp-industry.14.presentation.pdf
Video: 2024.emnlp-industry.14.video.mp4
