Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models

Paul Röttger; Haitham Seelawi; Debora Nozza; Zeerak Talat; Bertie Vidgen

doi:10.18653/v1/2022.woah-1.15

Abstract

Hate speech detection models are typically evaluated on held-out test sets. However, this risks painting an incomplete and potentially misleading picture of model performance because of increasingly well-documented systematic gaps and biases in hate speech datasets. To enable more targeted diagnostic insights, recent research has thus introduced functional tests for hate speech detection models. However, these tests currently only exist for English-language content, which means that they cannot support the development of more effective models in other languages spoken by billions across the world. To help address this issue, we introduce Multilingual HateCheck (MHC), a suite of functional tests for multilingual hate speech detection models. MHC covers 34 functionalities across ten languages, which is more languages than any other hate speech dataset. To illustrate MHC’s utility, we train and test a high-performing multilingual hate speech detection model, and reveal critical model weaknesses for monolingual and cross-lingual applications.

Anthology ID: 2022.woah-1.15
Volume: Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)
Month: July
Year: 2022
Address: Seattle, Washington (Hybrid)
Editors: Kanika Narang, Aida Mostafazadeh Davani, Lambert Mathias, Bertie Vidgen, Zeerak Talat
Venue: WOAH
Publisher: Association for Computational Linguistics
Pages: 154–169
URL: https://aclanthology.org/2022.woah-1.15/
DOI: 10.18653/v1/2022.woah-1.15
Cite (ACL): Paul Röttger, Haitham Seelawi, Debora Nozza, Zeerak Talat, and Bertie Vidgen. 2022. Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models. In Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH), pages 154–169, Seattle, Washington (Hybrid). Association for Computational Linguistics.
Cite (Informal): Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models (Röttger et al., WOAH 2022)
PDF: https://aclanthology.org/2022.woah-1.15.pdf
Video: https://aclanthology.org/2022.woah-1.15.mp4
