Mohammad ALSmadi
2025
IntegrityAI at GenAI Detection Task 2: Detecting Machine-Generated Academic Essays in English and Arabic Using ELECTRA and Stylometry
Proceedings of the 1st Workshop on GenAI Content Detection (GenAIDetect)
We present a robust system for detecting machine-generated academic essays, built on pre-trained transformer models tailored to English and Arabic text. Our primary approach uses ELECTRA-Small for English and AraELECTRA-Base for Arabic, fine-tuned to balance accuracy against computational cost. By incorporating stylometric features such as word count, sentence length, and vocabulary richness, our models distinguish human-written from AI-generated content. The proposed models achieved an F1-score of 99.7%, ranking second among 26 teams in the English subtask, and 98.4%, ranking first among 23 teams in the Arabic subtask. Our main contributions are: (1) We develop lightweight and efficient models using ELECTRA-Small and AraELECTRA-Base, achieving an F1-score of 98.5% on the English dataset and 98.4% on the Arabic dataset, which demonstrates the value of combining transformer-based architectures with stylometric analysis. (2) We optimize our system to maintain high performance while remaining computationally efficient, making it suitable for deployment on GPUs with moderate memory capacity. (3) We also tested larger models, such as ELECTRA-Large, achieving a higher F1-score of 99.7% on the English dataset, which highlights the potential for further accuracy gains from more computationally intensive models.
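The three stylometric features named in the abstract (word count, sentence length, and vocabulary richness) can be illustrated with a minimal sketch. Several choices here are assumptions and are not taken from the paper: the function name `stylometric_features`, the regex-based tokenization, and the use of type-token ratio as the measure of vocabulary richness. How the system actually computes these features, and how it combines them with the ELECTRA representations, is not specified in the abstract.

```python
import re


def stylometric_features(text: str) -> dict:
    """Compute three simple stylometric features for one essay.

    Hypothetical helper for illustration only; the tokenization and
    the richness measure are assumptions, not the paper's method.
    """
    # Naive sentence split on ., !, ? and the Arabic question mark (U+061F).
    sentences = [s for s in re.split(r"[.!?\u061F]+", text) if s.strip()]
    # \w+ is Unicode-aware for str patterns, so it also matches Arabic letters.
    words = re.findall(r"\w+", text.lower())

    word_count = len(words)
    # Average sentence length, measured in words per sentence.
    avg_sentence_length = word_count / len(sentences) if sentences else 0.0
    # Type-token ratio: distinct words divided by total words (assumed measure).
    vocabulary_richness = len(set(words)) / word_count if word_count else 0.0

    return {
        "word_count": word_count,
        "avg_sentence_length": avg_sentence_length,
        "vocabulary_richness": vocabulary_richness,
    }


if __name__ == "__main__":
    print(stylometric_features("The cat sat. The cat ran!"))
```

For the sample sentence pair, this yields 6 words, an average of 3 words per sentence, and a type-token ratio of 4/6. A real pipeline would likely need a language-aware tokenizer, particularly for Arabic text with diacritics, which this simple regex does not treat as part of a word.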