Andrii Maslo


2025

pdf bib
BuST: A Siamese Transformer Model for AI Text Detection in Bulgarian
Andrii Maslo | Silvia Gargova
Proceedings of Interdisciplinary Workshop on Observations of Misunderstood, Misguided and Malicious Use of Language Models

We introduce BuST (Bulgarian Siamese Transformer), a novel method for detecting machine-generated Bulgarian text using paraphrase-based semantic similarity. Inspired by the RAIDAR approach, BuST employs a Siamese Transformer architecture to compare input texts with their LLM-generated paraphrases, identifying subtle linguistic patterns that indicate synthetic origin. In pilot experiments, BuST achieved 88.79% accuracy and an F1-score of 88.0%, performing competitively with strong baselines. While BERT reached higher raw scores, BuST offers a model-agnostic and adaptable framework for low-resource settings, demonstrating the promise of paraphrase-driven detection strategies.