Native Language Identification on Text and Speech

Marcos Zampieri, Alina Maria Ciobanu, Liviu P. Dinu


Abstract
This paper presents an ensemble system that combines the output of multiple SVM classifiers for native language identification (NLI). The system was submitted to the fusion track of the NLI Shared Task 2017, which featured student essays and spoken responses, in the form of audio transcriptions and iVectors, by non-native English speakers of eleven native languages. Our system competed in the challenge under the team name ZCD and was based on an ensemble of SVM classifiers trained on character n-grams, achieving 83.58% accuracy and ranking 3rd in the shared task.
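As a rough illustration of the two ingredients the abstract names, character n-gram features and the combination of several classifiers' outputs, the following is a minimal sketch rather than the authors' system. SVM training is omitted entirely. The function names, the n-gram range, and the plurality-vote fusion rule are assumptions made here for illustration, since the abstract does not say how the classifier outputs were combined; the language labels are placeholders.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Count every character n-gram of length n_min..n_max in text."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of width n over the string.
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def plurality_vote(predictions):
    """Return the most frequent label; ties go to the label listed first.

    Assumed fusion rule: one predicted label per classifier, no weights.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Feature extraction for one (hypothetical) essay fragment.
features = char_ngrams("the cat", n_min=2, n_max=3)

# Hypothetical outputs of three classifiers for the same speaker.
fused = plurality_vote(["GER", "SPA", "GER"])
```

In a full pipeline, each n-gram count vector would feed one classifier, and `plurality_vote` would be applied to the per-classifier predictions for each test item.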
Anthology ID:
W17-5045
Volume:
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Joel Tetreault, Jill Burstein, Claudia Leacock, Helen Yannakoudakis
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
398–404
URL:
https://aclanthology.org/W17-5045/
DOI:
10.18653/v1/W17-5045
Cite (ACL):
Marcos Zampieri, Alina Maria Ciobanu, and Liviu P. Dinu. 2017. Native Language Identification on Text and Speech. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications, pages 398–404, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Native Language Identification on Text and Speech (Zampieri et al., BEA 2017)
PDF:
https://aclanthology.org/W17-5045.pdf