Advancing LLM detection in the ALTA 2024 Shared Task: Techniques and Analysis

Dima Galat


Abstract
The recent proliferation of AI-generated content has prompted significant interest in developing reliable detection methods. This study explores techniques for identifying AI-generated text through sentence-level evaluation within hybrid articles. Our findings indicate that ChatGPT-3.5 Turbo exhibits distinct, repetitive probability patterns that enable consistent in-domain detection. Empirical tests show that minor textual modifications, such as rewording, have minimal impact on detection accuracy. These results provide valuable insights for advancing AI detection methodologies, offering a pathway toward robust solutions to address the complexities of synthetic text identification.
Anthology ID:
2024.alta-1.18
Volume:
Proceedings of the 22nd Annual Workshop of the Australasian Language Technology Association
Month:
December
Year:
2024
Address:
Canberra, Australia
Editors:
Tim Baldwin, Sergio José Rodríguez Méndez, Nicholas Kuo
Venue:
ALTA
Publisher:
Association for Computational Linguistics
Pages:
203–206
URL:
https://aclanthology.org/2024.alta-1.18/
Cite (ACL):
Dima Galat. 2024. Advancing LLM detection in the ALTA 2024 Shared Task: Techniques and Analysis. In Proceedings of the 22nd Annual Workshop of the Australasian Language Technology Association, pages 203–206, Canberra, Australia. Association for Computational Linguistics.
Cite (Informal):
Advancing LLM detection in the ALTA 2024 Shared Task: Techniques and Analysis (Galat, ALTA 2024)
PDF:
https://aclanthology.org/2024.alta-1.18.pdf