Debiasing Word Embeddings with Nonlinear Geometry

Lu Cheng, Nayoung Kim, Huan Liu


Abstract
Debiasing word embeddings has been largely limited to individual and independent social categories. However, real-world corpora typically present multiple social categories that may correlate or intersect with each other. For instance, “hair weaves” is stereotypically associated with African American females, but with neither African Americans nor females alone. Therefore, this work studies biases associated with multiple social categories: joint biases induced by the union of different categories, and intersectional biases that do not overlap with the biases of the constituent categories. We first empirically observe that individual biases intersect non-trivially (i.e., over a one-dimensional subspace). Drawing on intersectional theory in social science and on linguistic theory, we then construct an intersectional subspace to debias for multiple social categories using the nonlinear geometry of individual biases. Empirical evaluations corroborate the efficacy of our approach.
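To make the phrase "intersect non-trivially (i.e., over a one-dimensional subspace)" concrete, here is a minimal sketch of how one might check whether two linear bias subspaces share a direction, using principal angles. It is not the authors' method: the function names, the tolerance, and the toy 5-dimensional vectors are assumptions for illustration only and do not come from real embeddings or from the paper's code.

```python
import numpy as np


def orthonormal_basis(vectors):
    """Orthonormal basis (as columns) for the span of the given row vectors.

    Assumes the input vectors are linearly independent.
    """
    q, _ = np.linalg.qr(np.asarray(vectors, dtype=float).T)
    return q


def intersection_dimension(basis_a, basis_b, tol=1e-8):
    """Count directions shared by two subspaces given by orthonormal bases.

    The singular values of A^T B are the cosines of the principal angles
    between the subspaces; a cosine of 1 marks a shared direction.
    """
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return int(np.sum(cosines > 1.0 - tol))


# Hypothetical toy "bias directions" in a 5-dimensional space.
e = np.eye(5)
gender_basis = orthonormal_basis([e[0] + e[1], e[1]])  # spans {e0, e1}
race_basis = orthonormal_basis([e[0], e[2]])           # spans {e0, e2}

# The two toy subspaces overlap only along e0.
shared_dims = intersection_dimension(gender_basis, race_basis)
print(shared_dims)
```

In this toy setup the two subspaces share exactly one direction, which is the kind of one-dimensional overlap the abstract describes for individual bias subspaces.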
Anthology ID:
2022.coling-1.110
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
1286–1298
URL:
https://aclanthology.org/2022.coling-1.110
Cite (ACL):
Lu Cheng, Nayoung Kim, and Huan Liu. 2022. Debiasing Word Embeddings with Nonlinear Geometry. In Proceedings of the 29th International Conference on Computational Linguistics, pages 1286–1298, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Debiasing Word Embeddings with Nonlinear Geometry (Cheng et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.110.pdf
Code:
lucheng/implementation-of-josec-coling-22 (GitHub)