Categorizing Offensive Language in Social Networks: A Chinese Corpus, Systems and an Explainable Tool

Xiangru Tang, Xianjun Shen


Abstract
A growing volume of online content is filled with offensive language such as threats, swear words, and outright insults. This is a disgrace for a progressive society, and it raises the question of how language resources and technologies can cope with the challenge. Previous work, however, analyzes the problem only as a whole and fails to detect particular types of offensive content at a finer granularity, mainly because annotated data are lacking. In this work, we present a densely annotated dataset, COLA
Anthology ID: 2020.ccl-1.97
Volume: Proceedings of the 19th Chinese National Conference on Computational Linguistics
Month: October
Year: 2020
Address: Haikou, China
Venue: CCL
Publisher: Chinese Information Processing Society of China
Pages: 1045–1056
Language: English
URL: https://aclanthology.org/2020.ccl-1.97
Cite (ACL): Xiangru Tang and Xianjun Shen. 2020. Categorizing Offensive Language in Social Networks: A Chinese Corpus, Systems and an Explainable Tool. In Proceedings of the 19th Chinese National Conference on Computational Linguistics, pages 1045–1056, Haikou, China. Chinese Information Processing Society of China.
Cite (Informal): Categorizing Offensive Language in Social Networks: A Chinese Corpus, Systems and an Explainable Tool (Tang & Shen, CCL 2020)
PDF: https://aclanthology.org/2020.ccl-1.97.pdf