CIC-NLP at GenAI Detection Task 1: Leveraging DistilBERT for Detecting Machine-Generated Text in English

Tolulope Olalekan Abiola; Tewodros Achamaleh Bizuneh; Oluwatobi Joseph Abiola; Temitope Olasunkanmi Oladepo; Olumide Ebenezer Ojo; Grigori Sidorov; Olga Kolesnikova

Abstract

As machine-generated texts (MGT) become increasingly similar to human writing, the distinctions between the two become harder to identify. In this paper, we, the CIC-NLP team, present our submission to the GenAI Content Detection Workshop at COLING 2025 for Task 1 Subtask A, which involves distinguishing text generated by LLMs from text authored by humans, with an emphasis on detecting English-only MGT. We applied the DistilBERT model to this binary classification task using the dataset provided by the organizers. The fine-tuned model differentiated between the two classes, achieving a micro-average F1-score of 0.70 on the evaluation test set. We provide a detailed explanation of the fine-tuning parameters and steps involved in our analysis.
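
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a micro-average F1-score can be computed for a binary human-versus-machine classifier. It is not the authors' evaluation code: the function name `micro_f1`, the label encoding (0 for human-written, 1 for machine-generated), and the toy labels are assumptions made for illustration only.

```python
from typing import Sequence


def micro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1 once."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1  # correctly predicted this class
            elif pred == label and truth != label:
                fp += 1  # predicted this class, but the true class differs
            elif pred != label and truth == label:
                fn += 1  # missed an instance of this class

    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical toy labels: 0 = human-written, 1 = machine-generated (assumed encoding).
gold = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1]
predicted = [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]
print(micro_f1(gold, predicted))  # 6 of 10 predictions are correct, so this prints 0.6
```

When every text receives exactly one label, as in this sketch, each misclassification counts once as a false positive and once as a false negative, so the micro-average F1 coincides with plain accuracy.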

Anthology ID: 2025.genaidetect-1.29
Volume: Proceedings of the 1st Workshop on GenAI Content Detection (GenAIDetect)
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Firoj Alam, Preslav Nakov, Nizar Habash, Iryna Gurevych, Shammur Chowdhury, Artem Shelmanov, Yuxia Wang, Ekaterina Artemova, Mucahid Kutlu, George Mikros
Venues: GenAIDetect | WS
Publisher: International Conference on Computational Linguistics
Pages: 271–277
URL: https://aclanthology.org/2025.genaidetect-1.29/
Cite (ACL): Tolulope Olalekan Abiola, Tewodros Achamaleh Bizuneh, Oluwatobi Joseph Abiola, Temitope Olasunkanmi Oladepo, Olumide Ebenezer Ojo, Grigori Sidorov, and Olga Kolesnikova. 2025. CIC-NLP at GenAI Detection Task 1: Leveraging DistilBERT for Detecting Machine-Generated Text in English. In Proceedings of the 1st Workshop on GenAI Content Detection (GenAIDetect), pages 271–277, Abu Dhabi, UAE. International Conference on Computational Linguistics.
Cite (Informal): CIC-NLP at GenAI Detection Task 1: Leveraging DistilBERT for Detecting Machine-Generated Text in English (Abiola et al., GenAIDetect 2025)
PDF: https://aclanthology.org/2025.genaidetect-1.29.pdf
Optional supplementary material: 2025.genaidetect-1.29.OptionalSupplementaryMaterial.txt
