Zijie Zeng

2024

Overview of the 2024 ALTA Shared Task: Detect Automatic AI-Generated Sentences for Human-AI Hybrid Articles
Diego Mollá | Qiongkai Xu | Zijie Zeng | Zhuang Li
Proceedings of the 22nd Annual Workshop of the Australasian Language Technology Association

The ALTA shared tasks have been running annually since 2010. In 2024, the purpose of the task is to detect machine-generated text in a hybrid setting where the text may contain portions of human text and portions machine-generated. In this paper, we present the task, the evaluation criteria, and the results of the systems participating in the shared task.

2022

pdf bib abs

Do Deep Neural Nets Display Human-like Attention in Short Answer Scoring?
Zijie Zeng | Xinyu Li | Dragan Gasevic | Guanliang Chen
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Deep Learning (DL) techniques have been increasingly adopted for Automatic Text Scoring in education. However, these techniques often suffer from their inabilities to explain and justify how a prediction is made, which, unavoidably, decreases their trustworthiness and hinders educators from embracing them in practice. This study aimed to investigate whether (and to what extent) DL-based graders align with human graders regarding the important words they identify when marking short answer questions. To this end, we first conducted a user study to ask human graders to manually annotate important words in assessing answer quality and then measured the overlap between these human-annotated words and those identified by DL-based graders (i.e., those receiving large attention weights). Furthermore, we ran a randomized controlled experiment to explore the impact of highlighting important words detected by DL-based graders on human grading. The results showed that: (i) DL-based graders, to a certain degree, displayed alignment with human graders no matter whether DL-based graders and human graders agreed on the quality of an answer; and (ii) it is possible to facilitate human grading by highlighting those DL-detected important words, though further investigations are necessary to understand how human graders exploit such highlighted words.

Co-authors

Qiongkai Xu 1

Venues

ALTA1
NAACL1

Fix author