Detecting LLM-Assisted Cheating on Open-Ended Writing Tasks on Language Proficiency Tests

Chenhao Niu; Kevin P. Yancey; Ruidong Liu; Mirza Basim Baig; André Kenji Horie; James Sharpnack

doi:10.18653/v1/2024.emnlp-industry.70

Abstract

The high capability of recent Large Language Models (LLMs) has led to concerns about their possible misuse as cheating assistants on open-ended writing tasks in assessments. Although various detection methods have been proposed, most have not been evaluated on or optimized for real-world samples of LLM-assisted cheating, where the generated text is often copy-typed imperfectly by the test-taker. In this paper, we present a framework for training LLM-generated text detectors that can effectively detect LLM-generated samples even after they have been copy-typed. We enhance the existing transformer-based classifier training process with contrastive learning on constructed pairwise data and self-training on unlabeled data, and evaluate the improvements on a real-world dataset from the Duolingo English Test (DET), a high-stakes online English proficiency test. Our experiments demonstrate that the improved model outperforms the original transformer-based classifier and other baselines.

Anthology ID: 2024.emnlp-industry.70
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month: November
Year: 2024
Address: Miami, Florida, US
Editors: Franck Dernoncourt, Daniel Preoţiuc-Pietro, Anastasia Shimorina
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 940–953
URL: https://aclanthology.org/2024.emnlp-industry.70/
DOI: 10.18653/v1/2024.emnlp-industry.70
Cite (ACL): Chenhao Niu, Kevin P. Yancey, Ruidong Liu, Mirza Basim Baig, André Kenji Horie, and James Sharpnack. 2024. Detecting LLM-Assisted Cheating on Open-Ended Writing Tasks on Language Proficiency Tests. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 940–953, Miami, Florida, US. Association for Computational Linguistics.
Cite (Informal): Detecting LLM-Assisted Cheating on Open-Ended Writing Tasks on Language Proficiency Tests (Niu et al., EMNLP 2024)
PDF: https://aclanthology.org/2024.emnlp-industry.70.pdf
Presentation: 2024.emnlp-industry.70.presentation.pdf
Video: 2024.emnlp-industry.70.video.mp4
