CMI-AIGCX at GenAI Detection Task 2: Leveraging Multilingual Proxy LLMs for Machine-Generated Text Detection in Academic Essays

Kaijie Jiao, Xingyu Yao, Shixuan Ma, Sifan Fang, Zikang Guo, Benfeng Xu, Licheng Zhang, Quan Wang, Yongdong Zhang, Zhendong Mao


Abstract
This paper presents our approach to GenAI Detection Task 2, which aims to classify a given text as either machine-generated or human-written, with a particular emphasis on academic essays. We participated in Subtasks A and B, which cover English and Arabic essays, respectively. We propose a simple and efficient detection method that uses Llama-3.1-8B as a proxy model to capture the essence of each token in the text. These token-level essences are then processed and classified by a refined feature classification network. Our approach does not require fine-tuning the LLM; instead, it relies on the extensive multilingual knowledge the model acquired during pretraining. The results validate the approach and demonstrate that a proxy model with diverse multilingual knowledge can significantly enhance the detection of machine-generated text across multiple languages, regardless of model size. In Subtask A, we achieved an F1 score of 99.9%, ranking first out of 26 teams. In Subtask B, we achieved an F1 score of 96.5%, placing fourth out of 22 teams, with the same score as the third-place team.
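The pipeline described in the abstract (a frozen proxy model produces per-token features, and only a small classifier is trained on top) can be illustrated with a minimal sketch. The abstract does not say what the per-token "essence" is or how the classification network is built, so everything below is an assumption for illustration: the toy feature vectors stand in for whatever the proxy LLM would emit, mean pooling stands in for the feature processing, and a logistic-regression head stands in for the paper's classification network. The names `mean_pool`, `train_head`, and `predict_proba` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def mean_pool(token_features: Sequence[Sequence[float]]) -> List[float]:
    """Average per-token vectors into one fixed-size vector for the text."""
    n = len(token_features)
    dim = len(token_features[0])
    return [sum(tok[d] for tok in token_features) / n for d in range(dim)]


def predict_proba(weights: Sequence[float], bias: float,
                  pooled: Sequence[float]) -> float:
    """Probability that the text is machine-generated (label 1)."""
    z = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train_head(examples: Sequence[Sequence[float]], labels: Sequence[int],
               epochs: int = 200, lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit a logistic-regression head with plain SGD.

    Only the head is trained; the proxy model that produced the
    features is never updated (no fine-tuning).
    """
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            err = predict_proba(weights, bias, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Toy stand-ins for proxy-LLM output: one 2-d vector per token.
# These numbers are made up; they are not results from the paper.
machine_texts = [[[0.9, 0.1], [0.8, 0.2]], [[0.7, 0.1], [0.9, 0.3]]]
human_texts = [[[0.2, 0.8], [0.1, 0.9]], [[0.3, 0.7], [0.2, 0.6]]]

pooled = [mean_pool(t) for t in machine_texts + human_texts]
labels = [1, 1, 0, 0]  # 1 = machine-generated, 0 = human-written
weights, bias = train_head(pooled, labels)

for vec, label in zip(pooled, labels):
    print(label, round(predict_proba(weights, bias, vec), 3))
```

In a real system the toy vectors would be replaced by features read from the proxy model, and the head would likely be a larger network; the point of the sketch is only that the detector's trainable part can stay small while the proxy remains frozen.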
Anthology ID:
2025.genaidetect-1.32
Volume:
Proceedings of the 1st Workshop on GenAI Content Detection (GenAIDetect)
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Firoj Alam, Preslav Nakov, Nizar Habash, Iryna Gurevych, Shammur Chowdhury, Artem Shelmanov, Yuxia Wang, Ekaterina Artemova, Mucahid Kutlu, George Mikros
Venues:
GenAIDetect | WS
Publisher:
International Conference on Computational Linguistics
Pages:
290–298
URL:
https://aclanthology.org/2025.genaidetect-1.32/
Cite (ACL):
Kaijie Jiao, Xingyu Yao, Shixuan Ma, Sifan Fang, Zikang Guo, Benfeng Xu, Licheng Zhang, Quan Wang, Yongdong Zhang, and Zhendong Mao. 2025. CMI-AIGCX at GenAI Detection Task 2: Leveraging Multilingual Proxy LLMs for Machine-Generated Text Detection in Academic Essays. In Proceedings of the 1st Workshop on GenAI Content Detection (GenAIDetect), pages 290–298, Abu Dhabi, UAE. International Conference on Computational Linguistics.
Cite (Informal):
CMI-AIGCX at GenAI Detection Task 2: Leveraging Multilingual Proxy LLMs for Machine-Generated Text Detection in Academic Essays (Jiao et al., GenAIDetect 2025)
PDF:
https://aclanthology.org/2025.genaidetect-1.32.pdf