Umut Ozbagriacik


2026

We present, to our knowledge, the first systematic transformer-based study of outlet-ideology classification for Turkish news. Using a topic-balanced corpus of Turkish political articles drawn from six outlets commonly perceived as left-, centre-, or right-leaning, we formulate a three-way outlet-ideology classification task. On this dataset, we evaluate a monolingual encoder (BERTurk), two multilingual encoders (mBERT, XLM-R), and a LoRA-adapted decoder model (Mistral). BERTurk achieves the best performance among individual models (70% accuracy, 71% macro-F1), comparable to results reported in English-language studies despite the lower-resource setting. Error analyses show that all encoders reliably distinguish centrist from partisan articles but frequently confuse left- and right-leaning articles. Moreover, BERTurk is relatively stronger on right-leaning content, whereas the multilingual models favour left-leaning content, suggesting an “ideological fingerprint” inherited from their pre-training data. Crucially, models fine-tuned on an English political-bias task fail to transfer to Turkish, collapsing to near-chance performance. Taken together, these results demonstrate that effective political bias detection requires target-language supervision and cannot be achieved through naïve cross-lingual transfer. Our work establishes a first baseline for Turkish political bias detection and underscores the need for open, carefully designed Turkish (and broader Turkic) bias benchmarks to support robust and fair media analysis.
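
To make the classification setup concrete, the sketch below fine-tunes BERTurk for the three-way outlet-ideology task using the Hugging Face transformers Trainer. It is a minimal illustration under stated assumptions, not the paper's actual pipeline: the checkpoint dbmdz/bert-base-turkish-cased is a standard public BERTurk release, while the label names, hyperparameters, and toy training data are placeholders chosen for readability.

    # Minimal sketch: three-way outlet-ideology classification with BERTurk.
    # Assumptions: public BERTurk checkpoint, illustrative labels and
    # hyperparameters, toy placeholder data instead of the paper's corpus.
    import torch
    from torch.utils.data import Dataset
    from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                              Trainer, TrainingArguments)

    LABELS = ["left", "centre", "right"]  # assumed label inventory

    class NewsDataset(Dataset):
        """Wraps (text, label) pairs as tokenized tensors for the Trainer."""
        def __init__(self, texts, labels, tokenizer):
            self.enc = tokenizer(texts, truncation=True, padding=True,
                                 max_length=512, return_tensors="pt")
            self.labels = torch.tensor(labels)

        def __len__(self):
            return len(self.labels)

        def __getitem__(self, i):
            item = {k: v[i] for k, v in self.enc.items()}
            item["labels"] = self.labels[i]
            return item

    tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "dbmdz/bert-base-turkish-cased", num_labels=len(LABELS))

    # Toy placeholder data ("örnek haber metni" = "sample news text");
    # the study instead uses a topic-balanced corpus from six outlets.
    train = NewsDataset(["örnek haber metni"] * 3, [0, 1, 2], tokenizer)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="berturk-ideology",
                               num_train_epochs=3,
                               per_device_train_batch_size=8),
        train_dataset=train,
    )
    trainer.train()

The multilingual baselines (mBERT, XLM-R) drop into the same scaffold by swapping the checkpoint name; the LoRA-adapted decoder (Mistral) would additionally wrap the model with a parameter-efficient adapter, e.g. via the peft library.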