Mandadoddi Srikar Vardhan
2025
CNLP-NITS-PP at GenAI Detection Task 1: AI-Generated Text Using Transformer-Based Approaches
Annepaka Yadagiri | Sai Teja Lekkala | Mandadoddi Srikar Vardhan | Partha Pakray | Reddi Mohana Krishna
Proceedings of the 1st Workshop on GenAI Content Detection (GenAIDetect)
In the current digital landscape, distinguishing between text written by humans and text generated by large language models (LLMs) has become increasingly difficult. The challenge is exacerbated by advanced LLMs such as Gemini, ChatGPT, GPT-4, and LLaMA, which can produce highly sophisticated, human-like text. This indistinguishability creates problems across several sectors. In cybersecurity, it raises the risk of social engineering and misinformation, while on social media it aids the spread of biased or false content. The educational sector faces issues of academic integrity, and in large, multi-team environments these models complicate the management of interactions between human and AI agents. To address these challenges, we treat the problem as a binary classification task on the English-language COLING benchmark dataset. We employ transformer-based neural network models, namely BERT, DistilBERT, and RoBERTa, fine-tuning each with optimized hyperparameters to maximize classification accuracy. Our team, CNLP-NITS-PP, ranked 23rd in Subtask 1 at COLING 2025 for machine-generated text detection in English, with a macro-F1 (the main score) of 0.6502 and a micro-F1 of 0.6876.
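The abstract reports both a macro-F1 and a micro-F1 score. As a minimal sketch of how the two metrics differ on a binary human-vs-machine task, the Python below computes both from scratch. The function name `f1_scores`, the label encoding (0 = human, 1 = machine), and the toy labels are assumptions made for illustration; this is not the shared task's official scorer or the authors' code.

```python
from collections import Counter


def f1_scores(y_true, y_pred, labels=(0, 1)):
    """Return (macro_f1, micro_f1) for single-label classification.

    Hypothetical helper for illustration only.
    Assumed label encoding: 0 = human-written, 1 = machine-generated.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    # Macro-F1: compute F1 per class, then take the unweighted mean,
    # so a small class counts as much as a large one.
    per_class = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    macro = sum(per_class) / len(per_class)

    # Micro-F1: pool the counts over all classes before computing F1.
    # For single-label classification this equals plain accuracy.
    total_tp = sum(tp[c] for c in labels)
    total_fp = sum(fp[c] for c in labels)
    total_fn = sum(fn[c] for c in labels)
    micro_denom = 2 * total_tp + total_fp + total_fn
    micro = 2 * total_tp / micro_denom if micro_denom else 0.0
    return macro, micro


if __name__ == "__main__":
    # Toy, imbalanced example: six machine-generated texts, two human ones.
    y_true = [1, 1, 1, 1, 1, 1, 0, 0]
    y_pred = [1, 1, 1, 1, 1, 1, 1, 0]
    macro, micro = f1_scores(y_true, y_pred)
    print(round(macro, 4), round(micro, 4))  # prints 0.7949 0.875
```

In this toy case the single error falls on the minority class, so the macro score drops below the micro score. More generally, the two scores can diverge whenever performance is uneven across classes, which is one reason to report both.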