Experiments with Universal CEFR Classification

Sowmya Vajjala, Taraka Rama


Abstract
The Common European Framework of Reference (CEFR) guidelines describe the language proficiency of learners on a six-level scale. While the CEFR guidelines are described generically across languages, the development of automated proficiency classification systems for different languages follows different approaches. In this paper, we explore universal CEFR classification using domain-specific and domain-agnostic, theory-guided as well as data-driven features. We report the results of our preliminary experiments in monolingual, cross-lingual, and multilingual classification with three languages: German, Czech, and Italian. Our results show that monolingual and multilingual models achieve similar performance, and that cross-lingual classification yields lower results that are nonetheless comparable to those of monolingual classification.
Anthology ID:
W18-0515
Volume:
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Editors:
Joel Tetreault, Jill Burstein, Ekaterina Kochmar, Claudia Leacock, Helen Yannakoudakis
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
147–153
URL:
https://aclanthology.org/W18-0515/
DOI:
10.18653/v1/W18-0515
Cite (ACL):
Sowmya Vajjala and Taraka Rama. 2018. Experiments with Universal CEFR Classification. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 147–153, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):
Experiments with Universal CEFR Classification (Vajjala & Rama, BEA 2018)
PDF:
https://aclanthology.org/W18-0515.pdf
Code:
nishkalavallabhi/UniversalCEFRScoring
Data:
Universal Dependencies