ICI Innolabs at SemEval-2026 Task 13: Sliding Windows Meet Code Transformers

Sebastian Balmus; Bogdan Dura


Abstract

We describe our system for SemEval-2026 Task 13, Subtask B, which focuses on multi-class authorship attribution for code: given a code snippet, the goal is to predict whether it is human-written or generated by one of ten LLM families. The task presents two central challenges: severe class imbalance and long input sequences that frequently exceed the context length of encoder-based Transformers. To address these issues, we adopt a window-based fine-tuning and inference framework. During training, we randomly sample 512-token windows from each snippet and optimize a class-weighted cross-entropy objective with label smoothing. At inference time, we apply a sliding-window strategy and aggregate window-level logits to obtain a snippet-level prediction. We fine-tune three pretrained code encoders (CodeBERT, UniXcoder, and StarEncoder) under this framework and combine their outputs via majority voting. On the official validation split, our best single model (StarEncoder) achieves 0.60 macro F1. On the final test set, the three-model ensemble reaches 0.41 macro F1, ranking 10th on the leaderboard. Our results demonstrate that window-based modeling combined with imbalance-aware optimization provides a robust and reproducible baseline for multi-class LLM attribution under distribution shift.
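The windowing, aggregation, and voting steps described in the abstract can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names, the stride of 256, the use of a mean to aggregate window-level logits, and the tie-breaking rule in the vote are all assumptions, since the abstract specifies only the 512-token window size, sliding-window inference, logit aggregation, and majority voting. The class-weighted loss with label smoothing is omitted here.

```python
import random
from collections import Counter

WINDOW = 512  # window length in tokens, as stated in the abstract


def sample_training_window(token_ids, window=WINDOW, rng=random):
    """Pick one random contiguous window (training-time view of a snippet)."""
    if len(token_ids) <= window:
        return list(token_ids)
    start = rng.randrange(len(token_ids) - window + 1)
    return list(token_ids[start:start + window])


def sliding_windows(token_ids, window=WINDOW, stride=256):
    """Yield overlapping windows covering the snippet (stride is an assumption)."""
    if len(token_ids) <= window:
        yield list(token_ids)
        return
    last_start = len(token_ids) - window
    starts = list(range(0, last_start + 1, stride))
    if starts[-1] != last_start:
        starts.append(last_start)  # add a final window so the tail is covered
    for s in starts:
        yield list(token_ids[s:s + window])


def aggregate_logits(window_logits):
    """Average per-window logits and return the argmax class (mean is assumed)."""
    n = len(window_logits)
    num_classes = len(window_logits[0])
    mean = [sum(w[c] for w in window_logits) / n for c in range(num_classes)]
    return max(range(num_classes), key=mean.__getitem__)


def majority_vote(predictions):
    """Return the most common label; ties go to the earliest-listed model."""
    counts = Counter(predictions)
    best = max(counts.values())
    return next(p for p in predictions if counts[p] == best)
```

In use, each fine-tuned encoder would score every window from `sliding_windows`, `aggregate_logits` would turn those scores into one snippet-level label per model, and `majority_vote` would combine the three labels. Averaging logits treats all windows equally; other aggregation choices (for example, a maximum or a length-weighted mean) would fit the same interface.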

Anthology ID: 2026.semeval-1.409
Volume: Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
Venues: SemEval | WS
Publisher: Association for Computational Linguistics
Pages: 3274–3279
URL: https://aclanthology.org/2026.semeval-1.409/
Cite (ACL): Sebastian Balmus and Bogdan Dura. 2026. ICI Innolabs at SemEval-2026 Task 13: Sliding Windows Meet Code Transformers. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 3274–3279, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): ICI Innolabs at SemEval-2026 Task 13: Sliding Windows Meet Code Transformers (Balmus & Dura, SemEval 2026)
PDF: https://aclanthology.org/2026.semeval-1.409.pdf
Supplementary material: 2026.semeval-1.409.SupplementaryMaterial.zip
