Yiwei Gu


2022

pdf bib
Automatically Detecting Reduced-formed English Pronunciations by Using Deep Learning
Lei Chen | Chenglin Jiang | Yiwei Gu | Yang Liu | Jiahong Yuan
Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)

Reduced form pronunciations are widely used by native English speakers, especially in casual conversations. Second language (L2) learners have difficulty in processing reduced form pronunciations in listening comprehension and face challenges in production too. Meanwhile, training applications dedicated to reduced forms are still few. To solve this issue, we report on our first effort of using deep learning to evaluate L2 learners’ reduced form pronunciations. Compared with a baseline solution that uses an ASR to determine regular or reduced-formed pronunciations, a classifier that learns representative features via a convolution neural network (CNN) on low-level acoustic features, yields higher detection performance. F-1 metric has been increased from $0.690$ to $0.757$ on the reduction task. Furthermore, adding word entities to compute attention weights to better adjust the features learned by the CNN model helps increasing F-1 to $0.763$.