Aniket Vashishtha
2023
Performance and Risk Trade-offs for Multi-word Text Prediction at Scale
Aniket Vashishtha | S Sai Prasad | Payal Bajaj | Vishrav Chaudhary | Kate Cook | Sandipan Dandapat | Sunayana Sitaram | Monojit Choudhury
Findings of the Association for Computational Linguistics: EACL 2023
Large Language Models such as GPT-3 are well-suited for text prediction tasks, which can help and delight users during text composition. LLMs are known to generate ethically inappropriate predictions even for seemingly innocuous contexts. Toxicity detection followed by filtering is a common strategy for mitigating the harm from such predictions. However, as we shall argue in this paper, in the context of text prediction, it is not sufficient to detect and filter toxic content. One also needs to ensure factual correctness and group-level fairness of the predictions; failing to do so can make the system ineffective and nonsensical at best, and unfair and detrimental to the users at worst. We discuss the gaps and challenges of toxicity detection approaches (from blocklist-based approaches to sophisticated state-of-the-art neural classifiers) by evaluating them on the text prediction task for English against a manually crafted CheckList of harms targeted at different groups and different levels of severity.
On Evaluating and Mitigating Gender Biases in Multilingual Settings
Aniket Vashishtha | Kabir Ahuja | Sunayana Sitaram
Findings of the Association for Computational Linguistics: ACL 2023
While understanding and removing gender biases in language models has been a long-standing problem in Natural Language Processing, prior research has primarily been limited to English. In this work, we investigate some of the challenges of evaluating and mitigating biases in multilingual settings, which stem from a lack of existing benchmarks and resources for bias evaluation beyond English, especially for non-Western contexts. We first create a benchmark for evaluating gender biases in pre-trained masked language models by extending DisCo to different Indian languages using human annotations. We then extend various debiasing methods to work beyond English and evaluate their effectiveness for SOTA massively multilingual models on our proposed metric. Overall, our work highlights the challenges that arise while studying social biases in multilingual settings and provides resources as well as mitigation techniques to take a step toward scaling to more languages.
Co-authors
- Sunayana Sitaram 2
- Kabir Ahuja 1
- Payal Bajaj 1
- Vishrav Chaudhary 1
- Monojit Choudhury 1