A Shallow Neural Network for Native Language Identification with Character N-grams

Yunita Sari, Muhammad Rifqi Fatchurrahman, Meisyarah Dwiastuti


Abstract
This paper describes the systems submitted by the GadjahMada team to the Native Language Identification (NLI) Shared Task 2017. Our models use continuous representations of character n-grams, learned jointly with a feed-forward neural network classifier. Character n-grams have proven effective for style-based identification tasks, including NLI. Results on the test set show that the proposed model performs very well on the essay and fusion tracks, scoring above 0.8 in both macro-averaged F-score and accuracy.
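As an illustration of the idea in the abstract, the following is a minimal sketch of a shallow classifier whose character n-gram embeddings are updated by the same loss as the output layer. It is not the authors' implementation: the n-gram range (1 to 3), the embedding size, the mean pooling of embeddings, the learning rate, the class name ShallowNgramClassifier, and the two toy sentences are all assumptions made for this example.

```python
import math
import random


def char_ngrams(text, n_min=1, n_max=3):
    """Return all character n-grams of length n_min..n_max found in text."""
    return [text[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


class ShallowNgramClassifier:
    """Hypothetical sketch: mean of n-gram embeddings, then a softmax layer."""

    def __init__(self, vocab, num_classes, dim=8, seed=0):
        rng = random.Random(seed)
        self.index = {g: i for i, g in enumerate(sorted(vocab))}
        self.emb = [[rng.uniform(-0.5, 0.5) for _ in range(dim)]
                    for _ in self.index]
        self.out = [[0.0] * dim for _ in range(num_classes)]

    def forward(self, text):
        ids = [self.index[g] for g in char_ngrams(text) if g in self.index]
        dim = len(self.out[0])
        # Hidden layer: mean of the embeddings of the n-grams in the text.
        hidden = [sum(self.emb[i][d] for i in ids) / max(len(ids), 1)
                  for d in range(dim)]
        scores = [sum(w * h for w, h in zip(row, hidden)) for row in self.out]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        return ids, hidden, [e / total for e in exps]

    def train_step(self, text, label, lr=0.5):
        ids, hidden, probs = self.forward(text)
        # Gradient of cross-entropy w.r.t. the scores is (probs - one_hot).
        g_scores = [p - (1.0 if k == label else 0.0)
                    for k, p in enumerate(probs)]
        dim = len(hidden)
        g_hidden = [sum(g_scores[k] * self.out[k][d]
                        for k in range(len(self.out))) for d in range(dim)]
        for k, g in enumerate(g_scores):
            for d in range(dim):
                self.out[k][d] -= lr * g * hidden[d]
        # Joint learning: the same loss also updates the n-gram embeddings.
        for i in ids:
            for d in range(dim):
                self.emb[i][d] -= lr * g_hidden[d] / len(ids)
        return -math.log(probs[label])

    def predict(self, text):
        probs = self.forward(text)[2]
        return probs.index(max(probs))


# Toy, made-up (text, label) pairs; the labels stand in for two native languages.
data = [("i am agree with this", 0), ("he have many informations", 1)]
vocab = {g for text, _ in data for g in char_ngrams(text)}
model = ShallowNgramClassifier(vocab, num_classes=2)
losses = [sum(model.train_step(t, y) for t, y in data) for _ in range(200)]
```

In this sketch the summed training loss per epoch is collected in losses, so comparing losses[0] with losses[-1] gives a quick check that the embeddings and the output layer are being fitted together. A real system would train on the shared task corpus with a proper optimizer and held-out evaluation.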
Anthology ID:
W17-5027
Volume:
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Joel Tetreault, Jill Burstein, Claudia Leacock, Helen Yannakoudakis
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
249–254
URL:
https://aclanthology.org/W17-5027/
DOI:
10.18653/v1/W17-5027
Cite (ACL):
Yunita Sari, Muhammad Rifqi Fatchurrahman, and Meisyarah Dwiastuti. 2017. A Shallow Neural Network for Native Language Identification with Character N-grams. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications, pages 249–254, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
A Shallow Neural Network for Native Language Identification with Character N-grams (Sari et al., BEA 2017)
PDF:
https://aclanthology.org/W17-5027.pdf