The Role of Protected Class Word Lists in Bias Identification of Contextualized Word Representations

João Sedoc, Lyle Ungar


Abstract
Systemic bias in word embeddings has been widely reported and studied, and efforts have been made to debias them; however, new contextualized embeddings such as ELMo and BERT are only now being similarly studied. Standard debiasing methods require heterogeneous lists of target words to identify the “bias subspace”. We show that using new contextualized word embeddings in conceptor debiasing allows us to more accurately debias word embeddings by breaking target word lists into more homogeneous subsets and then combining (“Or’ing”) the debiasing conceptors of the different subsets.
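
The combining step described in the abstract can be illustrated with a minimal sketch. It assumes the standard conceptor definitions (C = R(R + α⁻²I)⁻¹, with Boolean NOT, AND and OR on conceptor matrices) and is not the authors' implementation. The function names, the aperture value `alpha=2.0`, and the random stand-in embeddings are illustrative assumptions; extracting contextualized vectors from ELMo or BERT is not shown.

```python
import numpy as np

def conceptor(X, alpha=2.0):
    """Conceptor of the rows of X (n_words x dim): C = R (R + alpha^-2 I)^-1.
    alpha (the aperture) is an assumed value, not taken from the paper."""
    n, d = X.shape
    R = X.T @ X / n  # correlation matrix of the word vectors
    return R @ np.linalg.inv(R + alpha ** -2 * np.eye(d))

def NOT(C):
    """Boolean negation of a conceptor: I - C."""
    return np.eye(C.shape[0]) - C

def AND(C, B):
    """Boolean AND; requires C and B to be invertible."""
    I = np.eye(C.shape[0])
    return np.linalg.inv(np.linalg.inv(C) + np.linalg.inv(B) - I)

def OR(C, B):
    """Boolean OR via De Morgan: NOT(AND(NOT C, NOT B)).
    NOT(C) is invertible for any finite aperture, so this is well defined
    even when the subset conceptors themselves are rank-deficient."""
    return NOT(AND(NOT(C), NOT(B)))

def debias(E, C):
    """Apply the negated conceptor to each row of E (n x dim)."""
    return E @ NOT(C).T

# Stand-in vectors for two homogeneous subsets of a target word list
# (for example pronouns vs. names); real use would load model embeddings.
rng = np.random.default_rng(0)
subset_a = rng.normal(size=(3, 5))
subset_b = rng.normal(size=(4, 5))

C_or = OR(conceptor(subset_a), conceptor(subset_b))
debiased = debias(rng.normal(size=(2, 5)), C_or)
```

Under these definitions, OR'ing two subset conceptors gives the same matrix as building one conceptor from the sum of the subsets' correlation matrices, so each subset contributes equally regardless of how many words it contains.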
Anthology ID:
W19-3808
Volume:
Proceedings of the First Workshop on Gender Bias in Natural Language Processing
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Marta R. Costa-jussà, Christian Hardmeier, Will Radford, Kellie Webster
Venue:
GeBNLP
Publisher:
Association for Computational Linguistics
Pages:
55–61
URL:
https://aclanthology.org/W19-3808
DOI:
10.18653/v1/W19-3808
Cite (ACL):
João Sedoc and Lyle Ungar. 2019. The Role of Protected Class Word Lists in Bias Identification of Contextualized Word Representations. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing, pages 55–61, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
The Role of Protected Class Word Lists in Bias Identification of Contextualized Word Representations (Sedoc & Ungar, GeBNLP 2019)
PDF:
https://aclanthology.org/W19-3808.pdf