Peter Gerjets


2025

LLM-Human Alignment in Evaluating Teacher Questioning Practices: Beyond Ratings to Explanation
Ruikun Hou | Tim Fütterer | Babette Bühler | Patrick Schreyer | Peter Gerjets | Ulrich Trautwein | Enkelejda Kasneci
Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Full Papers

This study investigates the alignment between large language models (LLMs) and human raters in assessing teacher questioning practices, moving beyond rating agreement to the evidence selected to justify their decisions. Findings highlight LLMs’ potential to support large-scale classroom observation through interpretable, evidence-based scoring, with possible implications for concrete teacher feedback.

2022

The Pure Poet: How Good is the Subjective Credibility and Stylistic Quality of Literary Short Texts Written with an Artificial Intelligence Tool as Compared to Texts Written by Human Authors?
Vivian Emily Gunser | Steffen Gottschling | Birgit Brucker | Sandra Richter | Dîlan Canan Çakir | Peter Gerjets
Proceedings of the First Workshop on Intelligent and Interactive Writing Assistants (In2Writing 2022)

The application of artificial intelligence (AI) for text generation in creative domains raises questions regarding the credibility of AI-generated content. In two studies, we explored whether readers can differentiate between AI-based and human-written texts (generated based on the first line of texts and poems of classic authors) and how the stylistic qualities of these texts are rated. Participants read 9 AI-based continuations and either 9 human-written continuations (Study 1, N=120) or 9 original continuations (Study 2, N=302). Participants’ task was to decide whether a continuation was written with an AI tool or not, to indicate their confidence in each decision, and to assess the stylistic text quality. Results showed that participants generally had low accuracy in differentiating between text types but were overconfident in their decisions. Regarding the assessment of stylistic quality, AI continuations were perceived as less well-written, inspiring, fascinating, interesting, and aesthetic than both human-written and original continuations.