Pixel Phantoms at SemEval-2026 Task 13: Exploring Classical and Neural Approaches for AI-Generated Code Detection

Jithu Morrison S; Janani Hariharakrishnan; Angel Deborah S; Rajalakshmi S

Abstract

This paper describes our system for SemEval-2026 Task 13, Subtask A: detecting whether a given code snippet is AI-generated or human-written. We explored a range of approaches, from classical machine learning baselines using TF-IDF representations to fine-tuned transformer models pre-trained on code, specifically CodeBERT and GraphCodeBERT. Our experiments revealed a notable degradation in model performance when CodeBERT was trained beyond an optimal number of steps, indicating that continued training within an epoch leads to overfitting or representation drift. GraphCodeBERT, by contrast, yielded our best submission, with a macro F1 score of 0.36866. Our findings highlight the sensitivity of code-specific transformers to training duration and suggest that early checkpoint selection is critical for this task.

Anthology ID: 2026.semeval-1.256
Volume: Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
Venues: SemEval | WS
Publisher: Association for Computational Linguistics
Pages: 2040–2045
URL: https://aclanthology.org/2026.semeval-1.256/
Cite (ACL): Jithu Morrison S, Janani Hariharakrishnan, Angel Deborah S, and Rajalakshmi S. 2026. Pixel Phantoms at SemEval-2026 Task 13: Exploring Classical and Neural Approaches for AI-Generated Code Detection. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 2040–2045, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): Pixel Phantoms at SemEval-2026 Task 13: Exploring Classical and Neural Approaches for AI-Generated Code Detection (S et al., SemEval 2026)
PDF: https://aclanthology.org/2026.semeval-1.256.pdf
Supplementary material: 2026.semeval-1.256.SupplementaryMaterial.zip
