Metrics for What, Metrics for Whom: Assessing Actionability of Bias Evaluation Metrics in NLP

Pieter Delobelle, Giuseppe Attanasio, Debora Nozza, Su Lin Blodgett, Zeerak Talat


Abstract
This paper introduces the concept of actionability in the context of bias measures in natural language processing (NLP). We define actionability as the degree to which a measure’s results enable informed action and propose a set of desiderata for assessing it. Building on existing frameworks such as measurement modeling, we argue that actionability is a crucial aspect of bias measures that has been largely overlooked in the literature. We conduct a comprehensive review of 146 papers proposing bias measures in NLP, examining whether and how they provide the information required for actionable results. Our findings reveal that many key elements of actionability, including a measure’s intended use and reliability assessment, are often unclear or entirely absent. This study highlights a significant gap in how bias measures in NLP are currently developed and reported, and we argue that this lack of clarity may impede the effective implementation and use of these measures. To address this issue, we offer recommendations for more comprehensive and actionable metric development and reporting practices in NLP bias research.
Anthology ID: 2024.emnlp-main.1207
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 21669–21691
URL: https://aclanthology.org/2024.emnlp-main.1207
Cite (ACL): Pieter Delobelle, Giuseppe Attanasio, Debora Nozza, Su Lin Blodgett, and Zeerak Talat. 2024. Metrics for What, Metrics for Whom: Assessing Actionability of Bias Evaluation Metrics in NLP. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 21669–21691, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Metrics for What, Metrics for Whom: Assessing Actionability of Bias Evaluation Metrics in NLP (Delobelle et al., EMNLP 2024)
PDF: https://aclanthology.org/2024.emnlp-main.1207.pdf