Towards Context-Based Violence Detection: A Korean Crime Dialogue Dataset

Minju Kim; Heuiyeen Yeen; Myoung-Wan Koo

doi:10.18653/v1/2024.findings-eacl.42

Abstract

To enhance the security of society, there is rising interest in using artificial intelligence (AI) to help detect and classify violence in daily life in advance. The field of violence detection has introduced various datasets, yet context-based violence detection predominantly focuses on vision data, with a notable lack of NLP datasets. To overcome this, this paper presents the first Korean dialogue dataset for classifying violence that occurs in online settings: the Korean Crime Dialogue Dataset (KCDD). KCDD contains 22,249 dialogues created by crowd workers assuming offline scenarios. It has four criminal classes that meet international legal standards and one clean class (Serious Threats, Extortion or Blackmail, Harassment in the Workplace, Other Harassment, and Clean Dialogue). In addition, we propose a strong baseline for the dataset, Relationship-Aware BERT. The model shows that understanding the varying relationships among interlocutors improves the performance of crime dialogue classification. We hope that the proposed dataset will be used to detect cases of violence and aid people in danger. The KCDD dataset and corresponding baseline implementations can be found at the following link: https://sites.google.com/view/kcdd.

Anthology ID: 2024.findings-eacl.42
Volume: Findings of the Association for Computational Linguistics: EACL 2024
Month: March
Year: 2024
Address: St. Julian’s, Malta
Editors: Yvette Graham, Matthew Purver
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 603–623
URL: https://aclanthology.org/2024.findings-eacl.42/
DOI: 10.18653/v1/2024.findings-eacl.42
Cite (ACL): Minju Kim, Heuiyeen Yeen, and Myoung-Wan Koo. 2024. Towards Context-Based Violence Detection: A Korean Crime Dialogue Dataset. In Findings of the Association for Computational Linguistics: EACL 2024, pages 603–623, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal): Towards Context-Based Violence Detection: A Korean Crime Dialogue Dataset (Kim et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-eacl.42.pdf
Software: 2024.findings-eacl.42.software.zip
Note: 2024.findings-eacl.42.note.zip
