@inproceedings{arango-monnar-etal-2022-resources,
title = "Resources for Multilingual Hate Speech Detection",
author = "Arango Monnar, Ayme and
Perez, Jorge and
Poblete, Barbara and
Salda{\~n}a, Magdalena and
Proust, Valentina",
editor = "Narang, Kanika and
Mostafazadeh Davani, Aida and
Mathias, Lambert and
Vidgen, Bertie and
Talat, Zeerak",
booktitle = "Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)",
month = jul,
year = "2022",
address = "Seattle, Washington (Hybrid)",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.woah-1.12",
doi = "10.18653/v1/2022.woah-1.12",
pages = "122--130",
abstract = "Most of the published approaches and resources for hate speech detection are tailored for the English language. In consequence, cross-lingual and cross-cultural perspectives lack some essential resources. The lack of diversity of the datasets in Spanish is notable. Variations throughout Spanish-speaking countries make existing datasets not enough to encompass the task in the different Spanish variants. We annotated 9834 tweets from Chile to enrich the existing Spanish resources with different words and new targets of hate that have not been considered in previous studies. We conducted several cross-dataset evaluation experiments of the models published in the literature using our Chilean dataset and two others in English and Spanish. We propose a comparative framework for quickly conducting comparative experiments using different previously published models. In addition, we set up a Codalab competition for further comparison of new models in a standard scenario, that is, data partitions and evaluation metrics. All resources can be accessed through a centralized repository for researchers to get a complete picture of the progress on the multilingual hate speech and offensive language detection task.",
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="arango-monnar-etal-2022-resources">
<titleInfo>
<title>Resources for Multilingual Hate Speech Detection</title>
</titleInfo>
<name type="personal">
<namePart type="given">Ayme</namePart>
<namePart type="family">Arango Monnar</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jorge</namePart>
<namePart type="family">Perez</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Barbara</namePart>
<namePart type="family">Poblete</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Magdalena</namePart>
<namePart type="family">Saldaña</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Valentina</namePart>
<namePart type="family">Proust</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2022-07</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Kanika</namePart>
<namePart type="family">Narang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Aida</namePart>
<namePart type="family">Mostafazadeh Davani</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Lambert</namePart>
<namePart type="family">Mathias</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Bertie</namePart>
<namePart type="family">Vidgen</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Zeerak</namePart>
<namePart type="family">Talat</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Seattle, Washington (Hybrid)</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>Most of the published approaches and resources for hate speech detection are tailored for the English language. In consequence, cross-lingual and cross-cultural perspectives lack some essential resources. The lack of diversity of the datasets in Spanish is notable. Variations throughout Spanish-speaking countries make existing datasets not enough to encompass the task in the different Spanish variants. We annotated 9834 tweets from Chile to enrich the existing Spanish resources with different words and new targets of hate that have not been considered in previous studies. We conducted several cross-dataset evaluation experiments of the models published in the literature using our Chilean dataset and two others in English and Spanish. We propose a comparative framework for quickly conducting comparative experiments using different previously published models. In addition, we set up a Codalab competition for further comparison of new models in a standard scenario, that is, data partitions and evaluation metrics. All resources can be accessed through a centralized repository for researchers to get a complete picture of the progress on the multilingual hate speech and offensive language detection task.</abstract>
<identifier type="citekey">arango-monnar-etal-2022-resources</identifier>
<identifier type="doi">10.18653/v1/2022.woah-1.12</identifier>
<location>
<url>https://aclanthology.org/2022.woah-1.12</url>
</location>
<part>
<date>2022-07</date>
<extent unit="page">
<start>122</start>
<end>130</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Resources for Multilingual Hate Speech Detection
%A Arango Monnar, Ayme
%A Perez, Jorge
%A Poblete, Barbara
%A Saldaña, Magdalena
%A Proust, Valentina
%Y Narang, Kanika
%Y Mostafazadeh Davani, Aida
%Y Mathias, Lambert
%Y Vidgen, Bertie
%Y Talat, Zeerak
%S Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)
%D 2022
%8 July
%I Association for Computational Linguistics
%C Seattle, Washington (Hybrid)
%F arango-monnar-etal-2022-resources
%X Most of the published approaches and resources for hate speech detection are tailored for the English language. In consequence, cross-lingual and cross-cultural perspectives lack some essential resources. The lack of diversity of the datasets in Spanish is notable. Variations throughout Spanish-speaking countries make existing datasets not enough to encompass the task in the different Spanish variants. We annotated 9834 tweets from Chile to enrich the existing Spanish resources with different words and new targets of hate that have not been considered in previous studies. We conducted several cross-dataset evaluation experiments of the models published in the literature using our Chilean dataset and two others in English and Spanish. We propose a comparative framework for quickly conducting comparative experiments using different previously published models. In addition, we set up a Codalab competition for further comparison of new models in a standard scenario, that is, data partitions and evaluation metrics. All resources can be accessed through a centralized repository for researchers to get a complete picture of the progress on the multilingual hate speech and offensive language detection task.
%R 10.18653/v1/2022.woah-1.12
%U https://aclanthology.org/2022.woah-1.12
%U https://doi.org/10.18653/v1/2022.woah-1.12
%P 122-130
Markdown (Informal)
[Resources for Multilingual Hate Speech Detection](https://aclanthology.org/2022.woah-1.12) (Arango Monnar et al., WOAH 2022)
ACL
- Ayme Arango Monnar, Jorge Perez, Barbara Poblete, Magdalena Saldaña, and Valentina Proust. 2022. Resources for Multilingual Hate Speech Detection. In Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH), pages 122–130, Seattle, Washington (Hybrid). Association for Computational Linguistics.