2024
InteLLA: Intelligent Language Learning Assistant for Assessing Language Proficiency through Interviews and Roleplays
Mao Saeki | Hiroaki Takatsu | Fuma Kurata | Shungo Suzuki | Masaki Eguchi | Ryuki Matsuura | Kotaro Takizawa | Sadahiro Yoshikawa | Yoichi Matsuyama
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue
In this paper, we propose a multimodal dialogue system designed to elicit spontaneous speech samples from second language learners for reliable oral proficiency assessment. The primary challenge in using dialogue systems for language testing lies in obtaining ratable speech samples that demonstrate the user's full interactional skill. To address this, we developed a virtual agent capable of conducting extended interactions consisting of a 15-minute interview and a 10-minute roleplay. The interview component is a system-led dialogue featuring questions that aim to elicit specific language functions from the user. The system dynamically adjusts the topic difficulty based on real-time assessments to provoke linguistic breakdowns as evidence of the user's upper limit of proficiency. The roleplay component is a mixed-initiative, collaborative conversation aimed at evaluating the user's interactional competence. Two experiments were conducted to evaluate our system's reliability in assessing oral proficiency. In experiment 1, we collected a total of 340 interview sessions, 45-72% of which successfully elicited the user's upper linguistic limit by adjusting the topic difficulty levels. In experiment 2, based on the roleplay dataset of 75 speakers, the interactional speech elicited by our system was found to be as ratable as that elicited by human examiners, as indicated by the reliability index of interactional ratings. These results demonstrate that our system can elicit ratable interactional performances comparable to those elicited by human interviewers. Finally, we report on the deployment of our system with over 10,000 university students in a real-world testing scenario.
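The abstract's dynamic topic-difficulty adjustment can be pictured as a simple control loop: after each learner turn, a real-time proficiency estimate decides whether the next question should be harder, easier, or at the same level. The sketch below is purely illustrative and not the authors' implementation; every name (estimate_proficiency, Topic, the thresholds) is a hypothetical placeholder.

```python
# Minimal sketch of level-adaptive questioning, assuming some external
# real-time proficiency estimator; all names and thresholds are illustrative.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Topic:
    prompt: str
    difficulty: int  # e.g., 1 (easy) .. 5 (hard)


def next_difficulty(current: int, score: float,
                    low: float = 0.4, high: float = 0.7,
                    min_d: int = 1, max_d: int = 5) -> int:
    """Raise difficulty while the learner copes; lower it after a breakdown."""
    if score >= high:
        return min(current + 1, max_d)
    if score <= low:
        return max(current - 1, min_d)
    return current


def run_interview(topics_by_level: Dict[int, List[Topic]],
                  turns: int,
                  estimate_proficiency: Callable[[str], float]) -> List[Topic]:
    level, history = 1, []
    for _ in range(turns):
        topic = topics_by_level[level][0]      # pick a question at this level
        answer = input(topic.prompt + " ")     # stand-in for the spoken answer
        score = estimate_proficiency(answer)   # assumed real-time assessment
        level = next_difficulty(level, score)
        history.append(topic)
    return history
```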
2023
Span Identification of Epistemic Stance-Taking in Academic Written English
Masaki Eguchi | Kristopher Kyle
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)
Responding to the increasing need for automated writing evaluation (AWE) systems to assess language use beyond lexis and grammar (Burstein et al., 2016), we introduce a new approach to identify rhetorical features of stance in academic English writing. Drawing on the discourse-analytic framework of engagement in the Appraisal analysis (Martin & White, 2005), we manually annotated 4,688 sentences (126,411 tokens) for eight rhetorical stance categories (e.g., PROCLAIM, ATTRIBUTION) and additional discourse elements. We then report an experiment to train machine learning models to identify and categorize the spans of these stance expressions. The best-performing model (RoBERTa + LSTM) achieved macro-averaged F1 of .7208 in the span identification of stance-taking expressions, slightly outperforming the intercoder reliability estimates before adjudication (F1 = .6629).
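The RoBERTa + LSTM span identifier described above can be read as a standard token-classification setup: a pretrained encoder feeds a bidirectional LSTM, whose states are projected onto per-token labels (e.g., BIO tags over stance categories). The following sketch shows that architecture under assumed hyperparameters; it is not the paper's released code.

```python
# Sketch of a RoBERTa + BiLSTM token tagger for stance spans (BIO scheme).
# The label set, hidden size, and checkpoint are assumptions.
import torch.nn as nn
from transformers import RobertaModel


class RobertaLstmTagger(nn.Module):
    def __init__(self, num_labels: int, hidden: int = 256):
        super().__init__()
        self.encoder = RobertaModel.from_pretrained("roberta-base")
        self.lstm = nn.LSTM(self.encoder.config.hidden_size, hidden,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        # Contextual token embeddings from the pretrained encoder
        hidden_states = self.encoder(input_ids=input_ids,
                                     attention_mask=attention_mask).last_hidden_state
        # BiLSTM over the sequence, then per-token logits over BIO stance labels
        lstm_out, _ = self.lstm(hidden_states)
        return self.classifier(lstm_out)
```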
2022
A Dependency Treebank of Spoken Second Language English
Kristopher Kyle | Masaki Eguchi | Aaron Miller | Theodore Sither
Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)
In this paper, we introduce a dependency treebank of spoken second language (L2) English that is annotated with part-of-speech (Penn POS) tags and syntactic dependencies (Universal Dependencies). We then evaluate the degree to which using this treebank as training data affects POS and UD annotation accuracy for L1 web texts, L2 written texts, and L2 spoken texts, compared to models trained on L1 texts only.
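The evaluation described in the abstract amounts to scoring parser output against gold CoNLL-U annotations on each domain (L1 web, L2 written, L2 spoken). A minimal sketch of such scoring, assuming the third-party conllu library and placeholder file names (not the authors' evaluation script), is below; it reports Penn POS accuracy, UAS, and LAS.

```python
# Sketch: compare gold vs. predicted CoNLL-U files for POS accuracy, UAS, LAS.
# File paths are placeholders; the evaluation split is an assumption.
from conllu import parse


def scores(gold_path: str, pred_path: str):
    gold = parse(open(gold_path, encoding="utf-8").read())
    pred = parse(open(pred_path, encoding="utf-8").read())
    pos_hits = uas_hits = las_hits = total = 0
    for g_sent, p_sent in zip(gold, pred):
        for g_tok, p_tok in zip(g_sent, p_sent):
            total += 1
            pos_hits += g_tok["xpos"] == p_tok["xpos"]   # Penn POS tag match
            head_ok = g_tok["head"] == p_tok["head"]
            uas_hits += head_ok                          # unlabeled attachment
            las_hits += head_ok and g_tok["deprel"] == p_tok["deprel"]  # labeled
    return pos_hits / total, uas_hits / total, las_hits / total


print(scores("l2_spoken_gold.conllu", "l2_spoken_pred.conllu"))
```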