Didier Fernando Salazar Estrada


2024

Leveraging Conflicts in Social Media Posts: Unintended Offense Dataset
Che Wei Tsai | Yen-Hao Huang | Tsu-Keng Liao | Didier Fernando Salazar Estrada | Retnani Latifah | Yi-Shin Chen
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

In multi-person communications, conflicts often arise. Individuals may each hold a different perspective. Additionally, commonly referenced offensive-language datasets frequently neglect contextual information and are primarily constructed with a focus on intended offenses. This study suggests that conflicts are pivotal in revealing a broader range of human interactions, including instances of unintended offensive language. This paper proposes a conflict-based data collection method that utilizes inter-conflict cues in multi-person communications. By focusing on specific cue posts within conversation threads, the proposed approach effectively identifies relevant instances for analysis. Detailed analyses show that the proposed approach efficiently gathers data on subtly offensive content. The experimental results indicate that incorporating elements of conflict into data collection not only significantly enhances the comprehensiveness and accuracy of detecting offensive language but also enriches our understanding of conflict dynamics in digital communication.