A Transparent Framework for Evaluating Unintended Demographic Bias in Word Embeddings

Chris Sweeney, Maryam Najafian


Abstract
Word embedding models have gained a lot of traction in the Natural Language Processing community; however, they suffer from unintended demographic biases. Most approaches to evaluating these biases rely on vector-space-based metrics such as the Word Embedding Association Test (WEAT). While these approaches offer useful geometric insight into unintended biases in the embedding vector space, they do not give an interpretable account of how the embeddings could cause discrimination in downstream NLP applications. In this work, we present a transparent framework and metric for evaluating discrimination across protected groups with respect to their word embedding bias. Our metric (Relative Negative Sentiment Bias, RNSB) measures fairness in word embeddings via the relative negative sentiment associated with demographic identity terms from various protected groups. We show that our framework and metric enable useful analysis of the bias in word embeddings.
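To make the idea of "relative negative sentiment" concrete, here is a minimal sketch of how an RNSB-style score could be computed. It assumes, following the definition in the full paper as we understand it, that each identity term already has a negative-sentiment probability from a sentiment classifier trained on the word embeddings, and that the score is the KL divergence between the normalized probabilities and a uniform distribution. The function name `rnsb` and the example probabilities are hypothetical and are not taken from the paper.

```python
import math


def rnsb(negative_probs):
    """Sketch of an RNSB-style score (illustrative, not the authors' code).

    negative_probs maps each identity term to the probability of negative
    sentiment assigned by some classifier (assumed to be given).
    Returns the KL divergence D_KL(P || U), where P is the normalized
    distribution over terms and U is uniform. 0.0 means every group
    receives the same share of negative sentiment.
    """
    total = sum(negative_probs.values())
    if total <= 0:
        raise ValueError("negative sentiment probabilities must sum to > 0")

    uniform = 1.0 / len(negative_probs)
    score = 0.0
    for prob in negative_probs.values():
        share = prob / total  # this term's share of total negative sentiment
        if share > 0:  # the limit of x * log(x) as x -> 0 is 0
            score += share * math.log(share / uniform)
    return score


# Hypothetical classifier outputs, for illustration only.
balanced = {"group_a": 0.5, "group_b": 0.5}
skewed = {"group_a": 0.9, "group_b": 0.1}

balanced_score = rnsb(balanced)
skewed_score = rnsb(skewed)
```

A larger score indicates that negative sentiment is concentrated on some identity terms more than others, which is the interpretable signal the abstract refers to.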
Anthology ID:
P19-1162
Original:
P19-1162v1
Version 2:
P19-1162v2
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1662–1667
URL:
https://aclanthology.org/P19-1162
DOI:
10.18653/v1/P19-1162
Cite (ACL):
Chris Sweeney and Maryam Najafian. 2019. A Transparent Framework for Evaluating Unintended Demographic Bias in Word Embeddings. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1662–1667, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
A Transparent Framework for Evaluating Unintended Demographic Bias in Word Embeddings (Sweeney & Najafian, ACL 2019)
PDF:
https://aclanthology.org/P19-1162.pdf
Video:
https://aclanthology.org/P19-1162.mp4
Data:
ConceptNet