A Comprehensive Study of Gender Bias in Chemical Named Entity Recognition Models

Xingmeng Zhao; Ali Niazi; Anthony Rios

doi:10.18653/v1/2024.naacl-long.245

Abstract

Chemical named entity recognition (NER) models are used in many downstream tasks, from adverse drug reaction identification to pharmacoepidemiology. However, it is unknown whether these models work the same for everyone. Performance disparities can potentially cause harm rather than the intended good. This paper assesses gender-related performance disparities in chemical NER systems. We develop a framework for measuring gender bias in chemical NER models using synthetic data and a newly annotated corpus of over 92,405 words with self-identified gender information from Reddit. Our evaluation of multiple biomedical NER models reveals evident biases. For instance, synthetic data suggests that female names are frequently misclassified as chemicals, especially when it comes to brand name mentions. Additionally, we observe performance disparities between female- and male-associated data in both datasets. Many systems fail to detect contraceptives such as birth control. Our findings emphasize the biases in chemical NER models, urging practitioners to account for these biases in downstream applications.
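
The abstract does not give the templates, name lists, or metrics used in the synthetic evaluation, so the following is only a minimal sketch of how a template-based check for names misclassified as chemicals could look. The templates, the name lists, the `toy_tagger` stand-in model, and the `name_false_positive_rate` helper are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List, Set

# Hypothetical templates and name lists, for illustration only.
TEMPLATES: List[str] = [
    "{name} said the aspirin helped.",
    "{name} stopped taking ibuprofen.",
]
NAMES: Dict[str, List[str]] = {
    "female": ["Allegra", "Maria"],
    "male": ["John", "David"],
}


def toy_tagger(sentence: str) -> Set[str]:
    """Stand-in for a chemical NER model (an assumption, not a real system).

    Returns the lowercased tokens it labels as chemicals using a tiny lexicon.
    "allegra" is included because it is both a brand name and a given name.
    """
    lexicon = {"aspirin", "ibuprofen", "allegra"}
    tokens = [tok.strip(".,").lower() for tok in sentence.split()]
    return {tok for tok in tokens if tok in lexicon}


def name_false_positive_rate(
    tagger: Callable[[str], Set[str]],
    templates: List[str],
    names: Dict[str, List[str]],
) -> Dict[str, float]:
    """Per group, the fraction of filled templates where the name is tagged as a chemical."""
    rates: Dict[str, float] = {}
    for group, group_names in names.items():
        errors = 0
        total = 0
        for name in group_names:
            for template in templates:
                predicted = tagger(template.format(name=name))
                errors += int(name.lower() in predicted)
                total += 1
        rates[group] = errors / total
    return rates


rates = name_false_positive_rate(toy_tagger, TEMPLATES, NAMES)
gap = rates["female"] - rates["male"]  # positive gap: female names mis-tagged more often
```

In practice the stand-in tagger would be replaced by the model under evaluation, and the same per-group comparison could be applied to precision, recall, or F1 on annotated data.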

Anthology ID: 2024.naacl-long.245
Volume: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 4360–4374
URL: https://aclanthology.org/2024.naacl-long.245/
DOI: 10.18653/v1/2024.naacl-long.245
Cite (ACL): Xingmeng Zhao, Ali Niazi, and Anthony Rios. 2024. A Comprehensive Study of Gender Bias in Chemical Named Entity Recognition Models. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 4360–4374, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): A Comprehensive Study of Gender Bias in Chemical Named Entity Recognition Models (Zhao et al., NAACL 2024)
PDF: https://aclanthology.org/2024.naacl-long.245.pdf
Video: https://aclanthology.org/2024.naacl-long.245.mp4
