An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution

Tien-Hong Lo; Fu-An Chao; Tzu-I Wu; Yao-Ting Sung; Berlin Chen

doi:10.18653/v1/2024.findings-naacl.86

Abstract

Automated speaking assessment (ASA) typically involves automatic speech recognition (ASR) and hand-crafted feature extraction from the ASR transcript of a learner’s speech. Recently, self-supervised learning (SSL) has shown stellar performance compared to traditional methods. However, SSL-based ASA systems face at least three data-related challenges: limited annotated data, an uneven distribution of learner proficiency levels, and non-uniform score intervals between different CEFR proficiency levels. To address these challenges, we explore two novel modeling strategies, metric-based classification and loss re-weighting, both of which leverage distinct SSL-based embedding features. Extensive experimental results on the ICNALE benchmark dataset suggest that our approach can outperform existing strong baselines by a sizable margin, achieving a significant improvement of more than 10% in CEFR prediction accuracy.
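The abstract names two strategies, metric-based classification and loss re-weighting, without giving their exact form. The following is a minimal sketch of what each idea could look like in general, not the authors' implementation. The inverse-frequency weighting scheme, the class-mean prototypes, the squared Euclidean distance, and all function names are assumptions chosen for illustration.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed scheme: weight each class by N / (K * count) so rare levels count more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class); each item of probs maps class -> probability."""
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)


def class_prototypes(embeddings, labels):
    """Assumed metric-based setup: one prototype per class, the mean of its embeddings."""
    counts = Counter(labels)
    sums = {}
    for vec, y in zip(embeddings, labels):
        acc = sums.setdefault(y, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def nearest_prototype(vec, prototypes):
    """Predict the class whose prototype is closest in squared Euclidean distance."""
    def dist(y):
        return sum((a - b) ** 2 for a, b in zip(vec, prototypes[y]))
    return min(prototypes, key=dist)
```

With an imbalanced label list such as three A2 responses and one B1 response, `inverse_frequency_weights` gives the B1 class three times the weight of A2, so errors on the rarer level contribute more to the loss. In practice the embeddings would come from an SSL speech encoder; here they are plain lists of floats.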

Anthology ID: 2024.findings-naacl.86
Volume: Findings of the Association for Computational Linguistics: NAACL 2024
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1352–1362
URL: https://aclanthology.org/2024.findings-naacl.86
DOI: 10.18653/v1/2024.findings-naacl.86
Cite (ACL): Tien-Hong Lo, Fu-An Chao, Tzu-i Wu, Yao-Ting Sung, and Berlin Chen. 2024. An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 1352–1362, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution (Lo et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-naacl.86.pdf
