Xiaochao Fan


2024

Giving Control Back to Models: Enabling Offensive Language Detection Models to Autonomously Identify and Mitigate Biases
Jiapeng Liu | Weijie Li | Xiaochao Fan | Wenjun Deng | Liang Yang | Yong Li | Yufeng Diao
Findings of the Association for Computational Linguistics: EMNLP 2024

The rapid development of social media has led to an increase in online harassment and offensive speech, posing significant challenges for effective content moderation. Existing automated detection models often exhibit a bias towards predicting offensive speech based on specific vocabulary, which not only compromises model fairness but also potentially exacerbates biases against vulnerable and minority groups. To address these issues, this paper proposes a bias self-awareness and data self-iteration framework for mitigating model biases. The framework aims to give control back to models, enabling offensive language detection models to autonomously identify and mitigate biases, through bias self-awareness algorithms and a self-iterative data augmentation method. Experimental results demonstrate that the proposed framework effectively reduces the false positive rate of models in both in-distribution and out-of-distribution tests, enhances model accuracy and fairness, and shows promising performance improvements in detecting offensive speech on larger-scale datasets.

2021

Hate Speech Detection Based on Sentiment Knowledge Sharing
Xianbing Zhou | Yang Yong | Xiaochao Fan | Ge Ren | Yunfeng Song | Yufeng Diao | Liang Yang | Hongfei Lin
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

The wanton spread of hate speech on the internet brings great harm to society and families. It is urgent to establish and improve automatic detection and active avoidance mechanisms for hate speech. While there exist methods for hate speech detection, they stereotype words and hence suffer from inherently biased training. Drawing more affective features from other affective resources can significantly affect the performance of hate speech detection. In this paper, we propose a hate speech detection framework based on sentiment knowledge sharing. While extracting the affective features of the target sentence itself, we make better use of the sentiment features from external resources, and finally fuse features from different feature extraction units to detect hate speech. Experimental results on two public datasets demonstrate the effectiveness of our model.

2020

基于多粒度语义交互理解网络的幽默等级识别(A Multi-Granularity Semantic Interaction Understanding Network for Humor Level Recognition)
Jinhui Zhang (张瑾晖) | Shaowu Zhang (张绍武) | Xiaochao Fan (樊小超) | Liang Yang (杨亮) | Hongfei Lin (林鸿飞)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

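Humor plays an important role in everyday communication. With the rapid development of artificial intelligence, humor level recognition has become one of the hot research problems in natural language processing. Existing work on humor level recognition often treats a humorous text as a whole and ignores the semantic relations within it. This paper treats humor level recognition as a natural language inference task: it divides a humorous text into two parts, the “setup” and the “punchline”, models the semantics of each part and the semantic relation between them, and proposes a multi-granularity semantic interaction understanding network that captures semantic associations and interactions in humorous text at two granularities, words and clauses. Experiments were conducted on a public Reddit humor dataset; compared with the previous best result, the model improves accuracy on the corpus by 1.3%. The experiments show that introducing information about the semantic relations within humor can improve a model's humor recognition performance, and that the proposed model captures these semantic relations well.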