@inproceedings{shahriar-etal-2025-erosion,
title = "The Erosion of {LLM} Signatures: Can We Still Distinguish Human and {LLM}-Generated Scientific Ideas after Iterative Paraphrasing?",
author = "Shahriar, Sadat and
Ayoobi, Navid and
Mukherjee, Arjun",
editor = "Angelova, Galia and
Kunilovskaya, Maria and
Escribe, Marie and
Mitkov, Ruslan",
booktitle = "Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era",
month = sep,
year = "2025",
address = "Varna, Bulgaria",
publisher = "INCOMA Ltd., Shoumen, Bulgaria",
url = "https://aclanthology.org/2025.ranlp-1.129/",
pages = "1118--1126",
abstract = "With the increasing reliance on LLMs as research agents, distinguishing between LLM and human-generated ideas has become crucial for understanding the cognitive nuances of LLMs' research capabilities. While detecting LLM-generated text has been extensively studied, distinguishing human vs LLM-generated *scientific ideas* remains an unexplored area. In this work, we systematically evaluate the ability of state-of-the-art (SOTA) machine learning models to differentiate between human and LLM-generated ideas, particularly after successive paraphrasing stages. Our findings highlight the challenges SOTA models face in source attribution, with detection performance declining by an average of 25.4{\%} after five consecutive paraphrasing stages. Additionally, we demonstrate that incorporating the research problem as contextual information improves detection performance by up to 2.97{\%}. Notably, our analysis reveals that detection algorithms struggle significantly when ideas are paraphrased into a simplified, non-expert style, contributing the most to the erosion of distinguishable LLM signatures."
}