Cross-corpora experiments of automatic proficiency assessment and error detection for spoken English

Stefano Bannò, Marco Matassoni


Abstract
The growing demand for learning English as a second language has led to increasing interest in automatic approaches for assessing spoken language proficiency. One of the most significant challenges in this field is the lack of publicly available annotated spoken data. Another common issue is the lack of consistency and coherence in human assessment. To tackle both problems, this paper addresses the task of automatically predicting the scores of spoken test responses from learners of English as a second language. We train neural models on written data and use the presence of grammatical errors as a feature, since the distribution and frequency of such errors can be considered consistent indicators of proficiency. Specifically, we train a feature extractor on EFCAMDAT, a large written corpus containing error annotations and proficiency levels assigned by human experts, to extract information related to grammatical errors. We then use the resulting model for inference on the CLC-FCE corpus, the ICNALE corpus, and the spoken section of the TLT-school corpus, a collection of proficiency tests taken by Italian students. The work investigates the impact of the feature extractor on spoken proficiency assessment as well as the written-to-spoken approach. We find that our error-based approach can be beneficial for assessing spoken proficiency. The results obtained on the considered datasets are discussed and evaluated with appropriate metrics.
Anthology ID:
2022.bea-1.12
Volume:
Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)
Month:
July
Year:
2022
Address:
Seattle, Washington
Editors:
Ekaterina Kochmar, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Nitin Madnani, Anaïs Tack, Victoria Yaneva, Zheng Yuan, Torsten Zesch
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
82–91
URL:
https://aclanthology.org/2022.bea-1.12
DOI:
10.18653/v1/2022.bea-1.12
Cite (ACL):
Stefano Bannò and Marco Matassoni. 2022. Cross-corpora experiments of automatic proficiency assessment and error detection for spoken English. In Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022), pages 82–91, Seattle, Washington. Association for Computational Linguistics.
Cite (Informal):
Cross-corpora experiments of automatic proficiency assessment and error detection for spoken English (Bannò & Matassoni, BEA 2022)
PDF:
https://aclanthology.org/2022.bea-1.12.pdf
Video:
https://aclanthology.org/2022.bea-1.12.mp4