Proceedings of the Shared Task on Multi-Domain Detection of AI-Generated Text
Salima Lamsiyah | Saad Ezzini | Abdelkader El Mahdaoui | Hamza Alami | Abdessamad Benlahbib | Samir El Amrani | Salmane Chafik | Hicham Hammouchi
M-DAIGT: A Shared Task on Multi-Domain Detection of AI-Generated Text
Salima Lamsiyah | Saad Ezzini | Abdelkader El Mahdaouy | Hamza Alami | Abdessamad Benlahbib | Samir El amrany | Salmane Chafik | Hicham Hammouchi
The generation of highly fluent text by Large Language Models (LLMs) poses a significant challenge to information integrity and academic research. In this paper, we introduce the Multi-Domain Detection of AI-Generated Text (M-DAIGT) shared task, which focuses on detecting AI-generated text across multiple domains, particularly in news articles and academic writing. M-DAIGT comprises two binary classification subtasks: News Article Detection (NAD) (Subtask 1) and Academic Writing Detection (AWD) (Subtask 2). To support this task, we developed and released a new large-scale benchmark dataset of 30,000 samples, balanced between human-written and AI-generated texts. The AI-generated content was produced using a variety of modern LLMs (e.g., GPT-4, Claude) and diverse prompting strategies. A total of 46 unique teams registered for the shared task, of which four teams submitted final results. All four teams participated in both Subtask 1 and Subtask 2. We describe the methods employed by these participating teams and briefly discuss future directions for M-DAIGT.
AI-Generated Text Detection Using DeBERTa with Auxiliary Stylometric Features
Annepaka Yadagiri | L. D. M. S Sai Teja | Partha Pakray | Chukhu Chunka
The global proliferation of Generative Artificial Intelligence (GenAI) has led to the increasing presence of AI-generated text across a wide spectrum of topics, ranging from everyday content to critical and specialized domains. Often, individuals are unaware that the text they interact with was produced by AI systems rather than human authors, leading to instances where AI-generated content is unintentionally combined with human-written material. In response to this growing concern, we propose a novel approach as part of the Multi-Domain AI-Generated Text Detection (M-DAIGT) shared task, which aims to accurately identify AI-generated content across multiple domains, particularly in news reporting and academic writing. Given the rapid evolution of large language models (LLMs), distinguishing between human-authored and AI-generated text has become increasingly challenging. To address this, our method employs fine-tuning strategies using transformer-based language models for binary text classification. We focus on two specific domains, news and scholarly writing, and demonstrate that our approach, based on the DeBERTa transformer model, achieves superior performance in identifying AI-generated text. Our team, CNLP-NITS-PP, achieved 5th position in Subtask 1 and 3rd position in Subtask 2.
Shared Task on Multi-Domain Detection of AI-Generated Text (M-DAIGT)
Sareem Farooqui | Ali Zain | Dr Muhammad Rafi
We participated in two subtasks: Subtask 1, focusing on news articles, and Subtask 2, focusing on academic abstracts. Our submission is based on three distinct architectural approaches: (1) fine-tuning a RoBERTa-base model, (2) a TF-IDF based system with a Linear Support Vector Machine (SVM) classifier, and (3) an experimental system named Candace, which feeds probabilistic features extracted from multiple Llama-3.2 models (1B and 3B variants) into a Transformer Encoder-based classifier. Our RoBERTa-based system demonstrated strong performance on the development and test sets for both subtasks and was chosen as our primary submission to both subtasks.
A Multimodal Transformer-based Approach for Cross-Domain Detection of Machine-Generated Text
Mohammad AL-Smadi
The rapid advancement of large language models (LLMs) has made it increasingly challenging to distinguish between human-written and machine-generated content. This paper presents IntegrityAI, a multimodal ELECTRA-based model for the detection of AI-generated text across multiple domains. Our approach combines textual features processed through a pre-trained ELECTRA model with handcrafted stylometric features to create a robust classifier. We evaluate our system on the Multi-Domain Detection of AI-Generated Text (M-DAIGT) shared task, which focuses on identifying AI-generated content in news articles and academic writing. IntegrityAI ranked 1st in both subtasks, with F1-scores of 99.6% and 99.9% on the news article detection and academic writing detection subtasks, respectively. Our results demonstrate the effectiveness of combining transformer-based models with stylometric analysis for detecting AI-generated content across diverse domains and writing styles.
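As a rough illustration of what "handcrafted stylometric features" can look like, the sketch below computes four simple statistics from raw text using only the Python standard library. The feature set and the function name `stylometric_features` are assumptions made for illustration; the abstract does not list the features IntegrityAI uses, and the ELECTRA branch and the fusion step are not shown.

```python
import re
import string


def stylometric_features(text: str) -> dict[str, float]:
    """Compute a few illustrative stylometric statistics (hypothetical feature set)."""
    # Lowercased alphabetic tokens; apostrophes are kept inside words.
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Naive sentence split on terminal punctuation; empty pieces are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    n_chars = max(len(text), 1)
    return {
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "type_token_ratio": len(set(words)) / n_words,
        "punctuation_ratio": sum(c in string.punctuation for c in text) / n_chars,
    }
```

In a combined model of the kind the abstract describes, a vector like this would typically be concatenated with the transformer's pooled text representation before the classification layer; that wiring is likewise an assumption here.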
Inside the Box: A Streamlined Model for AI-Generated News Article Detection
Nsrin Ashraf | Mariam Labib | Hamada Nayel
The rapid proliferation of AI-generated text has raised concerns regarding authenticity, authorship, and the spread of misinformation. Detecting such content accurately and efficiently has become a pressing challenge. In this study, we propose a simple yet effective system for classifying AI-generated versus human-written text. Rather than relying on complex or resource-intensive deep learning architectures, our approach leverages classical machine learning algorithms combined with the TF-IDF text representation technique. Evaluated on the M-DAIGT shared task dataset, our Support Vector Machine (SVM) based system achieved strong results, ranking second on the official leaderboard and demonstrating competitive performance across all evaluation metrics. These findings highlight the potential of traditional lightweight models to address modern challenges in text authenticity detection, particularly in low-resource or real-time applications where interpretability and efficiency are essential.
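The TF-IDF plus linear SVM combination mentioned above can be sketched in a few lines of dependency-free Python. This is a toy illustration under stated assumptions, not the authors' system: the smoothed IDF formula, the hinge-loss subgradient trainer, the hyperparameters, the function names, and the four example documents are all hypothetical choices made for the sketch.

```python
import math
from collections import Counter


def tfidf_fit(docs):
    """Learn a vocabulary and smoothed IDF weights from tokenized documents."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    vocab = {tok: i for i, tok in enumerate(sorted(df))}
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1.0 for tok in vocab}
    return vocab, idf


def tfidf_transform(doc, vocab, idf):
    """Map one tokenized document to an L2-normalized TF-IDF vector."""
    vec = [0.0] * len(vocab)
    for tok, count in Counter(doc).items():
        if tok in vocab:
            vec[vocab[tok]] = count * idf[tok]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def train_linear_svm(X, y, epochs=200, lr=0.1, reg=0.01):
    """Subgradient descent on L2-regularized hinge loss; labels are +1 / -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            for i in range(len(w)):
                # Regularization always applies; the hinge term only on violations.
                grad = reg * w[i] - (label * x[i] if margin < 1 else 0.0)
                w[i] -= lr * grad
            if margin < 1:
                b += lr * label
    return w, b


def predict(x, w, b):
    """Return +1 (AI-generated) or -1 (human-written) for one vector."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy, made-up training data: +1 = AI-generated, -1 = human-written.
docs = [
    "in conclusion this topic is multifaceted".split(),
    "honestly that match was wild".split(),
    "overall this topic is important".split(),
    "we argued about that film".split(),
]
labels = [1, -1, 1, -1]

vocab, idf = tfidf_fit(docs)
X = [tfidf_transform(d, vocab, idf) for d in docs]
w, b = train_linear_svm(X, labels)
```

A practical system would more likely rely on an established library implementation and a held-out split; the point here is only that the whole pipeline reduces to a sparse weighting step followed by a single linear decision function, which is what makes such models cheap to train and easy to inspect.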