Gender Identity in Pretrained Language Models: An Inclusive Approach to Data Creation and Probing

Urban Knupleš, Agnieszka Falenska, Filip Miletić


Abstract
Pretrained language models (PLMs) have been shown to encode binary gender information of text authors, raising the risk of skewed representations and downstream harms. This effect has yet to be examined for transgender and non-binary identities, whose frequent marginalization may exacerbate harmful system behaviors. Addressing this gap, we first create TRANsCRIPT, a corpus of YouTube transcripts from transgender, cisgender, and non-binary speakers. Using this dataset, we probe various PLMs to assess whether they encode gender identity information, examining both frozen and fine-tuned representations as well as representations for inputs with author-specific words removed. Our findings reveal that PLM representations encode information for all gender identities but to different extents. The divergence is most pronounced for cis women and non-binary individuals, underscoring the critical need for gender-inclusive approaches to NLP systems.
Anthology ID:
2024.findings-emnlp.680
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
11612–11631
URL:
https://aclanthology.org/2024.findings-emnlp.680
Cite (ACL):
Urban Knupleš, Agnieszka Falenska, and Filip Miletić. 2024. Gender Identity in Pretrained Language Models: An Inclusive Approach to Data Creation and Probing. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 11612–11631, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Gender Identity in Pretrained Language Models: An Inclusive Approach to Data Creation and Probing (Knupleš et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.680.pdf