Native Language Identification using Phonetic Algorithms

Charese Smiley, Sandra Kübler


Abstract
In this paper, we discuss the results of the IUCL system in the NLI Shared Task 2017. For our system, we explore a variety of phonetic algorithms to generate features for Native Language Identification. These features are contrasted with one of the most successful types of features in NLI, character n-grams. We find that although phonetic features alone do not perform as well as character n-grams alone, they increase the overall F1 score when used together with character n-grams.
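To make the idea of combining phonetic features with character n-grams concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not name the phonetic algorithms used, so American Soundex is assumed here purely for illustration; the function names (`soundex`, `char_ngrams`, `features`) and the `char:` / `phon:` feature prefixes are hypothetical.

```python
import re
from collections import Counter

# Soundex digit classes (American Soundex); chosen here as an assumption,
# since the abstract does not say which phonetic algorithms were used.
_CODES = {}
for _digit, _letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                         "4": "l", "5": "mn", "6": "r"}.items():
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the 4-character American Soundex code of a word."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    out = [letters[0].upper()]
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w do not separate letters with the same code
        code = _CODES.get(c, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept
    return ("".join(out) + "000")[:4]


def char_ngrams(text, n):
    """Return all overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def features(text, n=3):
    """Count character n-grams and per-token Soundex codes in one bag."""
    feats = Counter()
    for gram in char_ngrams(text.lower(), n):
        feats["char:" + gram] += 1
    for token in re.findall(r"[A-Za-z]+", text):
        feats["phon:" + soundex(token)] += 1
    return feats
```

In this sketch, differently spelled but similar-sounding tokens such as "Robert" and "Rupert" collapse to the same phonetic feature, while the character n-grams keep the spelling differences. The resulting counts could be passed to any bag-of-features classifier.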
Anthology ID:
W17-5046
Volume:
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Joel Tetreault, Jill Burstein, Claudia Leacock, Helen Yannakoudakis
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
405–412
URL:
https://aclanthology.org/W17-5046
DOI:
10.18653/v1/W17-5046
Cite (ACL):
Charese Smiley and Sandra Kübler. 2017. Native Language Identification using Phonetic Algorithms. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications, pages 405–412, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Native Language Identification using Phonetic Algorithms (Smiley & Kübler, BEA 2017)
PDF:
https://aclanthology.org/W17-5046.pdf