Oussama Hansal
2023
Challenges and Issue of Gender Bias in Under-Represented Languages: An Empirical Study on Inuktitut-English NMT
Ngoc Tan Le | Oussama Hansal | Fatiha Sadat
Proceedings of the Sixth Workshop on the Use of Computational Methods in the Study of Endangered Languages
2022
Indigenous Language Revitalization and the Dilemma of Gender Bias
Oussama Hansal | Ngoc Tan Le | Fatiha Sadat
Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP)
Natural Language Processing (NLP), through its many applications, is considered one of the most valuable fields in interdisciplinary research as well as in computer science. However, it is not without flaws, and one of the most common is bias. This paper examines the main linguistic challenges of Inuktitut, an Indigenous language of Canada, and focuses on identifying and mitigating gender bias. We explore the unique characteristics of this language to determine which techniques are suited to identifying and mitigating implicit biases. We quantify the gender bias present in Inuktitut word embeddings, then mitigate that bias and evaluate the performance of the debiased embeddings. Next, we explain how approaches for detecting and reducing bias in English embeddings may be transferred to Inuktitut embeddings by properly taking into account the language’s particular characteristics. We then compare the effect of the debiasing techniques on Inuktitut and English. Finally, we highlight future research directions that could extend this work.