An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution

Tien-Hong Lo, Fu-An Chao, Tzu-i Wu, Yao-Ting Sung, Berlin Chen


Abstract
Automated speaking assessment (ASA) typically involves automatic speech recognition (ASR) and hand-crafted feature extraction from the ASR transcript of a learner’s speech. Recently, self-supervised learning (SSL) has shown strong performance compared to traditional methods. However, SSL-based ASA systems face at least three data-related challenges: limited annotated data, an uneven distribution of learner proficiency levels, and non-uniform score intervals between different CEFR proficiency levels. To address these challenges, we explore two novel modeling strategies, metric-based classification and loss re-weighting, both of which leverage distinct SSL-based embedding features. Extensive experimental results on the ICNALE benchmark dataset suggest that our approach can outperform existing strong baselines by a sizable margin, achieving a significant improvement of more than 10% in CEFR prediction accuracy.
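The abstract names two strategies, loss re-weighting and metric-based classification, without giving their formulation. The sketch below is only a generic illustration of those two ideas and is not the authors' implementation: it assumes inverse-frequency class weights for the re-weighted loss and a nearest-prototype rule over fixed embedding vectors for the metric-based classifier. All function names, the toy CEFR labels, and the two-dimensional "embeddings" are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting scheme: weight = N / (K * count), so rare classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted average of -log p(y); probs is a list of {label: probability} dicts."""
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)


def class_prototypes(embeddings, labels):
    """Average the embeddings of each class into one prototype vector."""
    sums, counts = {}, Counter(labels)
    for vec, y in zip(embeddings, labels):
        acc = sums.setdefault(y, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def nearest_prototype(vec, prototypes):
    """Predict the class whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda y: math.dist(vec, prototypes[y]))


# Hypothetical toy data: an imbalanced set of CEFR labels and 2-D stand-in embeddings.
labels = ["A2", "B1", "B1", "B1", "B2", "B2"]
embeddings = [[0.0, 0.0], [1.0, 1.0], [1.0, 3.0], [1.0, 2.0], [4.0, 4.0], [6.0, 4.0]]

weights = inverse_frequency_weights(labels)
prototypes = class_prototypes(embeddings, labels)
prediction = nearest_prototype([4.5, 3.5], prototypes)
```

In this toy setup the single A2 example receives the largest weight, and a query vector is assigned to whichever class mean it lies closest to. A real system would compute the embeddings with an SSL speech or text encoder and would typically learn the metric space rather than use raw Euclidean distance.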
Anthology ID:
2024.findings-naacl.86
Volume:
Findings of the Association for Computational Linguistics: NAACL 2024
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1352–1362
URL:
https://aclanthology.org/2024.findings-naacl.86
Cite (ACL):
Tien-Hong Lo, Fu-An Chao, Tzu-i Wu, Yao-Ting Sung, and Berlin Chen. 2024. An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 1352–1362, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution (Lo et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-naacl.86.pdf