On Measuring Social Biases in Sentence Encoders

Chandler May, Alex Wang, Shikha Bordia, Samuel R. Bowman, Rachel Rudinger


Abstract
The Word Embedding Association Test shows that GloVe and word2vec word embeddings exhibit human-like implicit biases based on gender, race, and other social constructs (Caliskan et al., 2017). Meanwhile, research on learning reusable text representations has begun to explore sentence-level texts, with some sentence encoders seeing enthusiastic adoption. Accordingly, we extend the Word Embedding Association Test to measure bias in sentence encoders. We then test several sentence encoders, including state-of-the-art methods such as ELMo and BERT, for the social biases studied in prior work and two important biases that are difficult or impossible to test at the word level. We observe mixed results, including suspicious patterns of sensitivity that suggest the test’s assumptions may not hold in general. We conclude by proposing directions for future work on measuring bias in sentence encoders.
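For readers unfamiliar with the test the abstract refers to, the following is a minimal sketch of the association effect size from Caliskan et al. (2017), applied to precomputed embedding vectors. It is not the authors' implementation. The function names (`cosine`, `association`, `effect_size`) and the toy vectors are hypothetical, and the use of the sample standard deviation is an assumption, since implementations differ on that point. For a sentence-level variant, the input vectors would be assumed to come from a sentence encoder rather than a word embedding table.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, A, B):
    """Mean similarity of w to attribute set A minus mean similarity to B."""
    return mean([cosine(w, a) for a in A]) - mean([cosine(w, b) for b in B])


def effect_size(X, Y, A, B):
    """Difference in mean association between target sets X and Y,
    normalized by the standard deviation over all targets.
    Assumption: sample standard deviation (n - 1 denominator)."""
    s_x = [association(x, A, B) for x in X]
    s_y = [association(y, A, B) for y in Y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Toy 2-d vectors (hypothetical, for illustration only).
X = [[1.0, 0.0], [0.9, 0.1]]   # target set X
Y = [[0.0, 1.0], [0.1, 0.9]]   # target set Y
A = [[1.0, 0.0]]               # attribute set A
B = [[0.0, 1.0]]               # attribute set B

d = effect_size(X, Y, A, B)
```

In this toy setup the X vectors lie close to A and the Y vectors close to B, so `d` is positive; swapping X and Y flips its sign.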
Anthology ID:
N19-1063
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
622–628
URL:
https://aclanthology.org/N19-1063
DOI:
10.18653/v1/N19-1063
Cite (ACL):
Chandler May, Alex Wang, Shikha Bordia, Samuel R. Bowman, and Rachel Rudinger. 2019. On Measuring Social Biases in Sentence Encoders. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 622–628, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
On Measuring Social Biases in Sentence Encoders (May et al., NAACL 2019)
PDF:
https://aclanthology.org/N19-1063.pdf
Supplementary:
 N19-1063.Supplementary.pdf
Dataset:
 N19-1063.Datasets.zip
Video:
 https://vimeo.com/347394290
Code:
 W4ngatang/sent-bias