Automatically Detecting Reduced-formed English Pronunciations by Using Deep Learning

Lei Chen, Chenglin Jiang, Yiwei Gu, Yang Liu, Jiahong Yuan


Abstract
Reduced-form pronunciations are widely used by native English speakers, especially in casual conversation. Second language (L2) learners have difficulty processing reduced-form pronunciations in listening comprehension and face challenges in producing them as well. Meanwhile, few training applications are dedicated to reduced forms. To address this gap, we report on our first effort to use deep learning to evaluate L2 learners’ reduced-form pronunciations. Compared with a baseline that uses an automatic speech recognition (ASR) system to decide whether a pronunciation is regular or reduced, a classifier that learns representative features with a convolutional neural network (CNN) over low-level acoustic features yields higher detection performance: the F1 score increases from $0.690$ to $0.757$ on the reduction task. Furthermore, using word entities to compute attention weights that adjust the features learned by the CNN model helps increase F1 to $0.763$.
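For readers who want to put the F1 figures quoted in the abstract in context, the following is a minimal sketch of how an F1 score is computed from classification counts and how the reported gains compare with the baseline in relative terms. The function names and the example counts are hypothetical and are not taken from the paper; only the three F1 values come from the abstract.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 as the harmonic mean of precision and recall, from raw counts."""
    if tp == 0:
        # No true positives: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def relative_gain(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`."""
    return (improved - baseline) / baseline


# F1 values as reported in the abstract.
ASR_BASELINE_F1 = 0.690
CNN_F1 = 0.757
CNN_ATTENTION_F1 = 0.763

# Hypothetical counts, for illustration only (not from the paper).
example_f1 = f1_score(tp=8, fp=2, fn=2)

if __name__ == "__main__":
    print(f"example F1 from hypothetical counts: {example_f1:.3f}")
    print(f"CNN vs. ASR baseline: {relative_gain(ASR_BASELINE_F1, CNN_F1):.1%}")
    print(f"CNN + attention vs. ASR baseline: "
          f"{relative_gain(ASR_BASELINE_F1, CNN_ATTENTION_F1):.1%}")
```

Under these definitions, the reported scores correspond to a relative F1 gain of roughly 10% over the ASR baseline.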
Anthology ID:
2022.bea-1.4
Volume:
Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)
Month:
July
Year:
2022
Address:
Seattle, Washington
Venues:
BEA | NAACL
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
22–26
URL:
https://aclanthology.org/2022.bea-1.4
DOI:
10.18653/v1/2022.bea-1.4
Cite (ACL):
Lei Chen, Chenglin Jiang, Yiwei Gu, Yang Liu, and Jiahong Yuan. 2022. Automatically Detecting Reduced-formed English Pronunciations by Using Deep Learning. In Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022), pages 22–26, Seattle, Washington. Association for Computational Linguistics.
Cite (Informal):
Automatically Detecting Reduced-formed English Pronunciations by Using Deep Learning (Chen et al., BEA 2022)
PDF:
https://aclanthology.org/2022.bea-1.4.pdf