Unraveling the Search Space of Abusive Language in Wikipedia with Dynamic Lexicon Acquisition

Wei-Fan Chen; Khalid Al Khatib; Matthias Hagen; Henning Wachsmuth; Benno Stein

doi:10.18653/v1/D19-5009

Abstract

Many discussions on online platforms suffer from users offending others by using abusive terminology, threatening each other, or being sarcastic. Since an automatic detection of abusive language can support human moderators of online discussion platforms, detecting abusiveness has recently received increased attention. However, the existing approaches simply train one classifier for the whole variety of abusiveness. In contrast, our approach is to distinguish explicitly abusive cases from the more “shadowed” ones. By dynamically extending a lexicon of abusive terms (e.g., including new obfuscations of abusive terms), our approach can support a moderator with explicit unraveled explanations for why something was flagged as abusive: due to known explicitly abusive terms, due to newly detected (obfuscated) terms, or due to shadowed cases.

Anthology ID: D19-5009
Volume: Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda
Month: November
Year: 2019
Address: Hong Kong, China
Editors: Anna Feldman, Giovanni Da San Martino, Alberto Barrón-Cedeño, Chris Brew, Chris Leberknight, Preslav Nakov
Venue: NLP4IF
Publisher: Association for Computational Linguistics
Pages: 76–82
URL: https://aclanthology.org/D19-5009/
DOI: 10.18653/v1/D19-5009
Cite (ACL): Wei-Fan Chen, Khalid Al Khatib, Matthias Hagen, Henning Wachsmuth, and Benno Stein. 2019. Unraveling the Search Space of Abusive Language in Wikipedia with Dynamic Lexicon Acquisition. In Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda, pages 76–82, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal): Unraveling the Search Space of Abusive Language in Wikipedia with Dynamic Lexicon Acquisition (Chen et al., NLP4IF 2019)
PDF: https://aclanthology.org/D19-5009.pdf
