Duke Nguyen


2024

“Is Hate Lost in Translation?”: Evaluation of Multilingual LGBTQIA+ Hate Speech Detection
Fai Leui Chan | Duke Nguyen | Aditya Joshi
Proceedings of the 22nd Annual Workshop of the Australasian Language Technology Association

This paper explores the challenges large language models face in detecting LGBTQIA+ hate speech across multiple languages, including English, Italian, Chinese and (code-mixed) English-Tamil, examining the impact of machine translation and whether the nuances of hate speech are preserved across translation. We examine the hate speech detection ability of zero-shot and fine-tuned GPT. Our findings indicate that: (1) performance is highest for English and lowest for the code-mixed English-Tamil scenario, and (2) fine-tuning improves performance consistently across languages whilst translation yields mixed results. Through simple experimentation with original text and machine-translated text for hate speech detection along with a qualitative error analysis, this paper sheds light on the socio-cultural nuances and complexities of languages that may not be captured by automatic translation.

2023

Stacking the Odds: Transformer-Based Ensemble for AI-Generated Text Detection
Duke Nguyen | Khaing Myat Noe Naing | Aditya Joshi
Proceedings of the 21st Annual Workshop of the Australasian Language Technology Association

This paper reports our submission under the team name ‘SynthDetectives’ to the ALTA 2023 Shared Task. We use a stacking ensemble of Transformers for the task of AI-generated text detection. Our approach is novel in its choice of models: we use accessible and lightweight models in the ensemble. We show that ensembling the models improves accuracy compared with using them individually. Our approach achieves an accuracy score of 0.9555 on the official test data provided by the shared task organisers.
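
For readers unfamiliar with stacking, the general idea can be sketched as follows. This is an illustrative toy, not the authors' implementation: the three base models, their probability outputs, the labels, and the helper names `fit_meta_classifier` and `predict` are all hypothetical, and the logistic-regression meta-classifier is an assumed choice that the abstract does not specify. Each base model scores a text with a probability that it is AI-generated, and a small meta-classifier learns how to combine those scores.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_classifier(base_probs, labels, lr=0.5, epochs=2000):
    """Fit a logistic-regression meta-classifier by batch gradient descent.

    base_probs: one row per text, one column per base model (hypothetical scores).
    labels: 1 for AI-generated, 0 for human-written.
    """
    n_models = len(base_probs[0])
    n = len(labels)
    weights = [0.0] * n_models
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for probs, y in zip(base_probs, labels):
            pred = sigmoid(sum(w * p for w, p in zip(weights, probs)) + bias)
            err = pred - y
            for j, p in enumerate(probs):
                grad_w[j] += err * p
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, probs):
    """Combine base-model probabilities into a single 0/1 decision."""
    score = sigmoid(sum(w * p for w, p in zip(weights, probs)) + bias)
    return 1 if score >= 0.5 else 0


# Made-up held-out predictions from three hypothetical base models.
held_out_probs = [
    [0.9, 0.8, 0.7],
    [0.8, 0.4, 0.9],
    [0.6, 0.9, 0.8],
    [0.2, 0.1, 0.3],
    [0.4, 0.2, 0.1],
    [0.1, 0.6, 0.2],
]
held_out_labels = [1, 1, 1, 0, 0, 0]

weights, bias = fit_meta_classifier(held_out_probs, held_out_labels)
print(predict(weights, bias, [0.9, 0.9, 0.9]))
```

In the toy data, the second and sixth rows each contain one base model that disagrees with the label; the meta-classifier still recovers the correct decision from the combined scores, which is the intuition behind ensembling outperforming individual models.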