Authorship Attribution in Multilingual Machine-Generated Texts

Lucio La Cava; Dominik Macko; Robert Moro; Ivan Srba; Andrea Tagarelli

Abstract

As Large Language Models (LLMs) have reached human-like fluency and coherence, distinguishing machine-generated text (MGT) from human-written content has become increasingly difficult. While early efforts in MGT detection focused on binary classification, the growing number and diversity of LLMs call for a more fine-grained and more challenging task, authorship attribution (AA), i.e., identifying the precise generator (an LLM or a human) behind a text. However, AA has so far remained confined to monolingual settings, with English the most investigated language, overlooking the multilingual nature and usage of modern LLMs. In this work, we introduce the problem of Multilingual Authorship Attribution, which involves attributing texts to a human or to one of multiple LLM generators across diverse languages. Focusing on 18 languages (covering multiple families and writing scripts) and 8 generators (7 LLMs and the human-authored class), we investigate the multilingual suitability of monolingual AA methods in terms of their cross-lingual transferability, and the impact of generators on attribution performance. Our results reveal that while certain monolingual AA methods can be adapted to multilingual settings, significant limitations and challenges remain, particularly in transferring across diverse language families. This underscores the complexity of multilingual AA and the need for more robust approaches that better match real-world scenarios.

Anthology ID: 2026.acl-long.2091
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 45136–45152
URL: https://aclanthology.org/2026.acl-long.2091/
Cite (ACL): Lucio La Cava, Dominik Macko, Robert Moro, Ivan Srba, and Andrea Tagarelli. 2026. Authorship Attribution in Multilingual Machine-Generated Texts. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 45136–45152, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Authorship Attribution in Multilingual Machine-Generated Texts (La Cava et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.2091.pdf
Checklist: 2026.acl-long.2091.checklist.pdf
