Yue-Yang He


2025

Leveraging Weak Segment Labels for Robust Automated Speaking Assessment in Read-Aloud Tasks
Yue-Yang He | Berlin Chen
Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025)

Automated speaking assessment (ASA) has become a crucial component in computer-assisted language learning, providing scalable, objective, and timely feedback to second-language learners. While early ASA systems relied on hand-crafted features and shallow classifiers, recent advances in self-supervised learning (SSL) have enabled richer representations for both text and speech, improving assessment accuracy. Despite these advances, challenges remain in evaluating long speech responses, due to limited labeled data, class imbalance, and the importance of pronunciation clarity and fluency, especially for read-aloud tasks. In this work, we propose a segment-based ASA framework leveraging WhisperX to split long responses into shorter fragments, generate weak labels from holistic scores, and aggregate segment-level predictions to obtain final proficiency scores. Experiments on the GEPT corpus demonstrate that our framework outperforms baseline holistic models, generalizes robustly to unseen prompts and speakers, and provides diagnostic insights at both segment and response levels.
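
The weak-labeling and aggregation steps described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not confirm: segments are taken as already produced (the WhisperX splitting step is not shown), every segment simply inherits the response's holistic score as its weak label, and segment predictions are combined by a plain mean. The function names `make_weak_labels` and `aggregate_segment_scores` are hypothetical and are not taken from the paper.

```python
from statistics import mean


def make_weak_labels(segments, holistic_score):
    """Pair every segment with the response-level score as its weak label.

    Assumption: the holistic score is copied unchanged to each segment;
    the paper may derive weak labels differently.
    """
    return [(segment, holistic_score) for segment in segments]


def aggregate_segment_scores(segment_scores):
    """Combine segment-level predictions into one response-level score.

    Assumption: a simple mean is used here for illustration only.
    """
    if not segment_scores:
        raise ValueError("at least one segment prediction is required")
    return mean(segment_scores)


# Hypothetical segments of one read-aloud response (already split).
segments = ["first fragment", "second fragment", "third fragment"]
weak_labels = make_weak_labels(segments, holistic_score=4)

# Hypothetical segment-level predictions from some trained model.
final_score = aggregate_segment_scores([4.0, 3.0, 5.0])
```

Other aggregation rules (a median, or a duration-weighted mean) would slot into `aggregate_segment_scores` without changing the weak-labeling step.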

2023

KNOT-MCTS: An Effective Approach to Addressing Hallucinations in Generative Language Modeling for Question Answering
Chung-Wen Wu | Guan-Tang Huang | Yue-Yang He | Berlin Chen
Proceedings of the 35th Conference on Computational Linguistics and Speech Processing (ROCLING 2023)