Cross-Cultural Transfer Learning for Chinese Offensive Language Detection

Detecting offensive language is a challenging task. Generalizing across different cultures and languages becomes even more challenging: besides lexical, syntactic and semantic differences, pragmatic aspects such as cultural norms and sensitivities, which are particularly relevant in this context, vary greatly. In this paper, we target Chinese offensive language detection and aim to investigate the impact of transfer learning using offensive language detection data from different cultural backgrounds, specifically Korean and English. We find that culture-specific biases in what is considered offensive negatively impact the transferability of language models (LMs) and that LMs trained on diverse cultural data are sensitive to different features in Chinese offensive language detection. In a few-shot learning scenario, however, our study shows promising prospects for non-English offensive language detection with limited resources. Our findings highlight the importance of cross-cultural transfer learning in improving offensive language detection and promoting inclusive digital spaces.


Introduction
The proliferation of offensive language and hate speech on online platforms, especially social media, has increased significantly in recent years (Zampieri et al., 2019, 2020; Gao et al., 2020). There is a fine line between offensive language and hate speech, as few universal definitions exist (Davidson et al., 2017); hate speech can therefore be regarded as a subtype of offensive language. In this paper, we do not differentiate between them and instead refer to the task of offensive language detection (OLD).
Despite numerous breakthroughs in the development of NLP methods for OLD (Liu et al., 2022; Rusert et al., 2022), significant obstacles remain unsolved (Vidgen et al., 2019), including the shortage of data resources for research purposes and bias in human annotation. Since most of the available approaches and resources for OLD are designed for English (Arango Monnar et al., 2022), the resulting trained models operate within a monocultural background that caters to English speakers. However, Schmidt and Wiegand (2017) argue that, unlike other NLP tasks, OLD has strong cultural implications, because an utterance's offensiveness can vary based on an individual's cultural background.
People with different backgrounds react to inputs and communicate differently, so their tolerance for the presence of offensive terms, e.g., slurs, may differ, as may what is considered offensive at all (Jay and Janschewitz, 2008). Cultural differences have been explored in humor perception (Jiang et al., 2019), swearing reception (Pavesi and Zamora, 2022), semantic inconsistencies in translation (Sperber et al., 1994) and honorific expressions (Song, 2015; Liu and Kobayashi, 2022). Even in less obvious cases, however, they bear meaningful significance for how NLP tasks are posed and solved, as cultures differ with respect to style, values, common ground and topics of interest (Hershcovich et al., 2022).
Therefore, we argue that there is a need to address cross-cultural aspects in offensive language detection. Although culture is intricate and challenging to define clearly, language remains one of its most straightforward manifestations. While recent work (Ringel et al., 2019; Ranasinghe and Zampieri, 2021) has demonstrated the effectiveness of cross-lingual transfer learning in text classification and offensive language (hate speech) detection, it does not consider the impact of differences in cultural background (e.g., between Eastern and Western cultures). In this paper, we take a step in this direction and explore the influence of offensive content from diverse cultural backgrounds on OLD, focusing on evaluation in Chinese. (Importantly, "culture" is multifaceted and complex. When referring to English speakers, we assume that there are general unique features that characterize them, but there is of course enormous diversity within speakers of the same language. As a first step towards the analysis of cross-cultural OLD, we restrict ourselves to the level of language categories.)
Our contributions are as follows: 1) We explore the impact of transfer learning using offensive language data from different cultural backgrounds on Chinese offensive language detection (§3). 2) We find that cultural differences in offensive language are expressed in the topics of texts, and that LMs are sensitive to these differences, learning culture-specific biases that negatively impact their transfer ability (§4). 3) We find that in the few-shot scenario, even with very limited Chinese examples, the model quickly adapts to the target culture.

Related work
Offensive language detection. Although most research on OLD has focused on English (Fortuna and Nunes, 2018), datasets exist in multiple languages: Chinese (Deng et al., 2022), Korean (Jeong et al., 2022), Danish (Sigurbergsson and Derczynski, 2020), Bengali (Das et al., 2022), and Nepali (Niraula et al., 2021), to name a few. However, language models commonly rely on prior distributions from their training data, which reflect a discourse that is temporally and culturally situated (Ghosh et al., 2021). In a comprehensive analysis of geographically-related content and its influence on performance disparities of offensive language detection models, Lwowski et al. (2022) find that current models do not generalize across locations. Sap et al. (2022) call for contextualizing offensiveness (toxicity) labels in social variables, as determining what is toxic is subjective and annotator beliefs can be reflected in the collected data.
Cross-lingual transfer learning. Cross-lingual transfer appears as a potential solution to the issue of language-specific resource scarcity (Lamprinidis et al., 2021). Nozza (2021) demonstrates the limits of cross-lingual zero-shot transfer for hate speech detection in English, Italian and Spanish. The benefits of few-shot learning are evident in the works of Stappen et al. (2020) and Röttger et al. (2022), who confirm the effectiveness of few-shot learning for hate speech detection in under-resourced languages. Ringel et al. (2019) harness cross-cultural differences for English formality and sarcasm detection based on German and Japanese, respectively. Litvak et al. (2022) show that, in the context of OLD, knowledge transfer is not bidirectional: efficient transfer learning holds from Arabic to Hebrew in terms of recall.

Method

Datasets
To explore the influence of different cultural backgrounds on Chinese OLD, the most straightforward approach is to adopt OLD datasets whose context and annotation process reflect diverse cultural backgrounds. We first select COLD (Deng et al., 2022), a Chinese benchmark dataset covering the topics of racial, gender, and regional bias, as our test dataset. We then select two other datasets for use in different training scenarios (see §3.2): KOLD (Jeong et al., 2022), a Korean dataset suited for OLD covering topics such as race, gender, political affiliation and religion; and HatEn, the English subset of HatEval (Basile et al., 2019), composed of tweets, which tends to capture a Western cultural background. Table 1 reports the statistics of the three datasets and the topic distributions of COLD. Notably, the three languages come from three different language families, making it less likely that linguistic similarities are a factor in effective transfer learning between the datasets.

Learning settings
We explore different learning settings by utilizing intra-cultural and cross-cultural training sets during fine-tuning. For the intra-cultural setting, we only use COLD as the training set, which ensures cultural consistency between the training and testing processes. In the cross-cultural setting, we further set up two ways: 1) zero-shot: use only KOLD or HatEn as the training set, so that the fine-tuning data comes from a completely different cultural background; 2) mix-training few-shot: mix COLD with another dataset (KOLD or HatEn) as the final training set, which introduces cultural interference and makes the acquisition of the target culture more challenging. For convenience, we use D[X] to denote the detector with X as its training set. We subsample the Korean training set to ensure consistency of the training data sizes with HatEn (the ratio of offensive to non-offensive data is 0.96). Since the datasets are in different languages, we apply the multilingual LMs mBERT base (Devlin et al., 2019), XLM-R base and XLM-R large (Conneau et al., 2020).

Translated data setting. As an additional control experiment, to factor out differences stemming from the languages themselves, we also translate COLD and KOLD into English with googletrans (https://pypi.org/project/googletrans/) and conduct experiments with English PLMs under the same settings: BERT base (Devlin et al., 2019), RoBERTa base and RoBERTa large (Liu et al., 2019).
Our models are optimized with a learning rate of 5e-5. We fine-tune each model for up to 100 epochs using early stopping with a patience of 5, and run each setting 5 times with different random seeds.
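The stopping criterion above can be sketched as follows; this is an illustrative snippet rather than our released code, and the function name and the use of one dev-set score per epoch are assumptions:

```python
def early_stop_epoch(dev_scores, patience=5):
    """Given one dev-set score per epoch, return the epoch at which
    training halts: when the best score has not improved for
    `patience` consecutive epochs, or the last epoch otherwise."""
    best, best_epoch = float("-inf"), 0
    for epoch, score in enumerate(dev_scores):
        if score > best:
            best, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            return epoch  # no improvement for `patience` epochs
    return len(dev_scores) - 1  # ran through all epochs
```

With a patience of 5, a run whose dev score plateaus after epoch 1 stops at epoch 6 rather than continuing to epoch 100.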
Results

Overall results. The experimental results on the COLD test set are shown in Table 2. Compared to the intra-cultural setting, we find that: 1) In the cross-cultural few-shot scenario, the performance differences between D[COLD] and D[CO + KO], and between D[COLD] and D[CO + HE], are both very small (less than one point at the maximum), which implies that with sufficient knowledge of the Chinese target culture, the intervention of other cultures does not diminish the ability to detect Chinese offensive language, but even contributes slightly. 2) In the cross-cultural zero-shot scenario, the detection ability of D[KOLD] and D[HatEn] degrades. In particular, the former is slightly better than the latter, implying that it is easier to detect Chinese offensive language from a Korean cultural background than from a Western one.
To better understand the detection of Chinese offensive language across cultural backgrounds, we look closer at the detection results in the intra-cultural and cross-cultural zero-shot settings. Figure 1 shows the distribution of the data and the predictions from our best-performing model, XLM-R large. First, D[COLD], which shares its cultural background with the test set, has the best ability to detect offense. D[HatEn] is the worst detector, with less than 50% accuracy on offensive data; conversely, it can be highly accurate on non-offensive data. This is why D[HatEn] attains a spuriously high accuracy on the test set but a very low F1 score (Table 2). It is noteworthy that the HatEn-trained model requires more severe language to be labeled as offensive, so some instances that should be classified as offensive may not be considered hate speech and are not classified as such. Moreover, for topic-specific offensive language detection, the performance of each detector also differs, with D[HatEn] performing worst on the regional topic.
Translated results. Table 3 shows the results of the experiments on the versions of the Chinese and Korean datasets translated into English; they exhibit trends similar to those in Table 2. This demonstrates that the results hold for cross-cultural transfer and are not simply due to linguistic similarities.
Few-shot learning. While the diverse cultural backgrounds of Korean and English may not enable precise detection of Chinese offensive language in a zero-shot scenario, they are not detrimental when integrated into the target culture in a few-shot scenario. When mixing heterogeneous cultural background knowledge, is it then necessary to provide sufficient target cultural background knowledge? To investigate this question, we conduct an analytical experiment under a few-shot setting by incorporating different amounts of COLD data into the training set. Figure 2 displays experimental results indicating that the relation between offensive language detection ability and target cultural knowledge follows a pattern similar to an increasing logarithmic function: performance improves rapidly with limited target cultural knowledge, but the gains gradually slow down as the amount of target knowledge increases. Specifically, when the number of COLD training examples is between 1 and 50, D[COLD] possesses limited knowledge from the training set, and its detection capability stems primarily from the pretrained model itself. At this stage, HatEn has a clearly negative effect, while KOLD has a positive effect. Within the range of 50 to 500, both HatEn and KOLD have a positive effect, while for COLD data scales greater than 500, the effect is still present but less pronounced. These findings offer promising opportunities for low-resource offensive language detection systems.
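The construction of the mixed few-shot training sets can be sketched as follows, assuming random subsampling of COLD (the function name and the shuffling choice are illustrative, not taken from our implementation):

```python
import random

def build_mixed_train(cold, other, n_cold, seed=0):
    """Subsample n_cold examples from the Chinese data (COLD) and mix
    them with the full cross-cultural set (KOLD or HatEn), shuffling
    so that batches interleave the two cultural backgrounds."""
    rng = random.Random(seed)  # fixed seed per run for reproducibility
    subset = rng.sample(cold, n_cold)
    mixed = subset + list(other)
    rng.shuffle(mixed)
    return mixed
```

Varying `n_cold` over, e.g., 1, 50, 500 and beyond corresponds to the x-axis of Figure 2.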
Case study. To provide an intuitive illustration of cultural differences, we use semantic similarity retrieval (Reimers and Gurevych, 2019) to find the cases from KOLD most similar to COLD, with the similarity threshold set to 0.7. As depicted in Table 4, sentences with similar topics and semantics (e.g., racial discrimination, politics) hold different labels across languages, suggesting the presence of cultural distinctions in offensive language detection and highlighting significant obstacles for few-shot learning. Thus, we emphasize the need for models with greater cultural adaptation that can integrate diverse cultural knowledge.
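The retrieval step operates over precomputed sentence embeddings (Reimers and Gurevych, 1999 is the embedding method cited above as Reimers and Gurevych, 2019); the cosine-similarity matching below is a minimal sketch with our 0.7 threshold, and the function names are illustrative rather than from an actual library:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieve_similar(query_embs, cand_embs, threshold=0.7):
    """For each query (COLD) embedding, return the index of the most
    similar candidate (KOLD) embedding if its cosine similarity clears
    the threshold, else None (no sufficiently close match)."""
    results = []
    for q in query_embs:
        sims = [cosine(q, c) for c in cand_embs]
        best = max(range(len(sims)), key=sims.__getitem__)
        results.append(best if sims[best] >= threshold else None)
    return results
```

Pairs retained by this matching are then compared by their gold labels, surfacing the label reversals shown in Table 4.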
[Table 4 content: retrieved example pairs include "America with a black president, against blacks? Now racism has disappeared?", "How can it be so easy to get drugs in China?", and "What will happen if a criminal is caught in China?", matched as semantically similar across COLD and KOLD.]
Table 4: Cases with reversed labels, retrieved via semantic vector similarity, suggesting the existence of cultural differences across languages. Non-offensive and offensive cases are labeled 0 and 1, respectively.

Conclusion
Our study highlights the challenges of detecting offensive language across different cultures and languages. We show that transfer learning using data from diverse cultural backgrounds negatively affects the transferability of language models to varying degrees due to culture-specific biases. However, our findings also indicate promising prospects for improving offensive language detection and promoting inclusive digital spaces, particularly in a few-shot learning scenario. We call for more research on cross-cultural offensive language detection, which is important for deploying effective moderation strategies on social media platforms, improving cross-cultural communication, and reducing harmful online behavior.

Limitations
Our study explores the impact of transfer learning on offensive language detection using data from different cultural backgrounds. However, treating HatEn as representative of a "Western cultural background" is too coarse, as it ignores the cultural differences between, for instance, American and British cultures. Moreover, "culture" is multifaceted and complex, and there is enormous diversity among speakers of the same language. As a first step towards cross-cultural offensive language detection, we restrict our analysis to the level of language categories.

Ethics Statement
The datasets used in this study are publicly available, and we strictly follow the ethical guidelines of the previous research from which the data originate. It is important to note that the content of these datasets does not represent our opinions or views.

Figure 1 :
Figure 1: A fine-grained view of the distribution of offensive detection results based on XLM-R large. For reference, the colored parts represent the distribution of the related data in the COLD test set. The model learns culture-specific biases, e.g., when trained on English, it tends not to classify region-related text as offensive.

Figure 2 :
Figure 2: The experimental results (F1) in the few-shot setting based on XLM-R large, evaluated on the COLD (Chinese) test set. Performance improves rapidly with training examples from the target culture. Pre-training on KOLD (Korean) provides a better starting point, while pre-training on HatEn (English) is detrimental.

Table 1 :
Dataset statistics (top) and topic distributions of COLD (bottom). Statistics of offensive and non-offensive data and the ratio between them are indicated in parentheses.

Table 2 :
Overall results on the COLD test set. † marks that the KOLD training set is subsampled to the same size as HatEn. CO, KO and HE are short for COLD, KOLD and HatEn, respectively. By a paired Student's t-test, * = differs significantly from the intra-cultural setting at p < 0.05, ** = significant difference at p < 0.01.

Table 3 :
The experimental results on the COLD test set, with all training and testing data translated into English. † marks that the KOLD training set is subsampled to the same size as HatEn.