Unsupervised Detection of LLM-Generated Polish Text Using Perplexity Difference

Krzysztof Wróbel


Abstract
Inspired by zero-shot detection methods that compare perplexity across model pairs, we investigate whether computing perplexity differences on whole-text character-level perplexity can effectively detect LLM-generated Polish text. Unlike token-level ratio methods that require compatible tokenizers, our approach enables pairing any two models regardless of tokenization. Through systematic evaluation of 91 model pairs on the PolEval 2025 ŚMIGIEL shared task, we identify Gemma-3-27B and PLLuM-12B as the optimal pair, achieving 81.22% accuracy on test data with unseen generators. Our difference-based approach outperforms token-level ratio methods (+5.5pp) and single-model baselines (+8.3pp) without using training labels, capturing asymmetric reactions where human text causes greater perplexity divergence than LLM text. We demonstrate that complementary model pairing (multilingual + monolingual) and architectural quality matter more than raw model size for this task.
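The core computation described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the per-token log-probability scorers are stand-ins for any two causal LMs (e.g. Gemma-3-27B and PLLuM-12B), and normalizing total negative log-likelihood by character count, rather than token count, is what makes scores comparable across models with incompatible tokenizers.

```python
import math
from typing import Callable, List

def char_level_perplexity(token_logprobs: List[float], n_chars: int) -> float:
    """Whole-text perplexity normalized by character count rather than
    token count, so scores from models with different tokenizers are
    directly comparable."""
    total_nll = -sum(token_logprobs)
    return math.exp(total_nll / n_chars)

def perplexity_difference(text: str,
                          score_a: Callable[[str], List[float]],
                          score_b: Callable[[str], List[float]]) -> float:
    """Character-level perplexity of `text` under model A minus model B.

    `score_a` / `score_b` return per-token log-probabilities for `text`
    (hypothetical wrappers around any two causal LMs); their tokenizations
    need not match, since both scores are normalized per character."""
    ppl_a = char_level_perplexity(score_a(text), len(text))
    ppl_b = char_level_perplexity(score_b(text), len(text))
    return ppl_a - ppl_b
```

A downstream detector would then threshold this difference, exploiting the asymmetry the abstract notes: human-written text tends to make the two models' perplexities diverge more than LLM-generated text does.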
Anthology ID:
2025.poleval-main.5
Volume:
Proceedings of the PolEval 2025 Workshop
Month:
November
Year:
2025
Address:
Warsaw
Editors:
Łukasz Kobyliński, Alina Wróblewska, Maciej Ogrodniczuk
Venues:
PolEval | WS
Publisher:
Institute of Computer Science PAS and Association for Computational Linguistics
Pages:
26–38
URL:
https://aclanthology.org/2025.poleval-main.5/
Cite (ACL):
Krzysztof Wróbel. 2025. Unsupervised Detection of LLM-Generated Polish Text Using Perplexity Difference. In Proceedings of the PolEval 2025 Workshop, pages 26–38, Warsaw. Institute of Computer Science PAS and Association for Computational Linguistics.
Cite (Informal):
Unsupervised Detection of LLM-Generated Polish Text Using Perplexity Difference (Wróbel, PolEval 2025)
PDF:
https://aclanthology.org/2025.poleval-main.5.pdf