MAGRET: Machine-generated Text Detection with Rewritten Texts

Yifei Huang, Jiuxin Cao, Hanyu Luo, Xin Guan, Bo Liu


Abstract
With the rapid advancement of the text generation abilities of Large Language Models (LLMs), concerns about the misuse of machine-generated content have grown, raising the risk of legal and ethical violations. Some existing studies concentrate on detecting machine-generated text from open-source models using in-model features, but their performance on closed-source large models is limited. This limitation arises because, in closed-source model detection, the only reference available is the generated texts themselves, which may differ significantly due to random sampling. In this paper, we demonstrate that texts generated by the same model can align both semantically and statistically under similar prompts, facilitating effective detection and traceability. Specifically, we fine-tune a BERT encoder through contrastive learning to achieve semantic alignment among randomly generated texts from the same model. We then propose a method called Machine-Generated Text Detection with Rewritten Texts (MAGRET), which uses several prompt refactoring methods to request rewritten texts from LLMs. The semantic and statistical relationships between the rewritten and original texts provide a basis for detection and traceability. Finally, we expand the text dataset with multi-parameter random sampling and verify the performance of MAGRET on three machine-generated text datasets. Experimental results show that previous methods struggle with closed-source model detection, while our approach significantly outperforms baseline methods in this regard. The results also show that MAGRET performs stably in detection and tracing tasks across various randomly sampled texts.
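The rewrite-and-compare idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the bag-of-words `embed` function stands in for the contrastively fine-tuned BERT encoder, the rewritten texts are assumed to have been obtained already from each candidate LLM, and the names `trace_source`, `rewrites_by_model`, and the `threshold` value are hypothetical choices made only for illustration. The abstract does not state the actual decision rule, so the "highest similarity above a threshold" rule below is an assumption.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence encoder: a bag-of-words count vector.

    MAGRET uses a fine-tuned BERT encoder; this is only a placeholder.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def trace_source(original, rewrites_by_model, threshold=0.8):
    """Attribute `original` to the model whose rewrite is most similar.

    Returns (label, scores). The label is a model name, or "human" when
    no rewrite reaches the (hypothetical) similarity threshold.
    `rewrites_by_model` must contain at least one entry.
    """
    query = embed(original)
    scores = {
        name: cosine(query, embed(rewrite))
        for name, rewrite in rewrites_by_model.items()
    }
    best = max(scores, key=scores.get)
    label = best if scores[best] >= threshold else "human"
    return label, scores


# Hypothetical rewrites, as if each candidate model had been asked to
# rewrite the text under test.
rewrites = {
    "model_a": "the cat sat quietly on the mat",
    "model_b": "a feline rested upon a rug in silence",
}
label, scores = trace_source("the cat sat on the mat", rewrites)
print(label, scores)
```

In this toy input the rewrite from `model_a` shares almost all of its tokens with the original, so it is selected as the source. A real system would replace `embed` with learned embeddings and would likely combine semantic similarity with the statistical features the abstract mentions.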
Anthology ID:
2025.coling-main.557
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
8336–8346
URL:
https://aclanthology.org/2025.coling-main.557/
Cite (ACL):
Yifei Huang, Jiuxin Cao, Hanyu Luo, Xin Guan, and Bo Liu. 2025. MAGRET: Machine-generated Text Detection with Rewritten Texts. In Proceedings of the 31st International Conference on Computational Linguistics, pages 8336–8346, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
MAGRET: Machine-generated Text Detection with Rewritten Texts (Huang et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.557.pdf