SB@GU at the Complex Word Identification 2018 Shared Task

David Alfter, Ildikó Pilán


Abstract
In this paper, we describe our experiments for the Shared Task on Complex Word Identification (CWI) 2018 (Yimam et al., 2018), hosted by the 13th Workshop on Innovative Use of NLP for Building Educational Applications (BEA) at NAACL 2018. Our system for English builds on previous work for Swedish on classifying words into proficiency levels. We investigate different features for English and compare their usefulness using feature selection methods. For the German, Spanish, and French data, we use simple systems based on character n-gram models and show that simple models sometimes achieve results comparable to those of fully feature-engineered systems.
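The abstract does not say how the character n-gram models are built or applied, so the following is only a minimal sketch of one common way such a model could flag complex words, not the authors' system. It assumes two add-one smoothed trigram models, one trained on words labelled simple and one on words labelled complex, and assigns a word to whichever model gives it the higher average log-probability. The names `char_ngrams`, `CharNgramModel`, and `is_complex` are hypothetical, and the smoothing scheme and n-gram order are assumptions.

```python
import math
from collections import Counter


def char_ngrams(word, n=3):
    """Return the character n-grams of a word, padded with boundary markers."""
    padded = "^" * (n - 1) + word.lower() + "$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class CharNgramModel:
    """Add-one smoothed character n-gram model (illustrative assumption)."""

    def __init__(self, words, n=3):
        self.n = n
        self.ngram_counts = Counter()
        self.context_counts = Counter()
        alphabet = set("^$")
        for word in words:
            alphabet.update(word.lower())
            for gram in char_ngrams(word, n):
                self.ngram_counts[gram] += 1
                self.context_counts[gram[:-1]] += 1
        # Number of distinct symbols seen in training, used for smoothing.
        self.vocab_size = len(alphabet)

    def avg_log_prob(self, word):
        """Average smoothed log-probability of the word's n-grams."""
        grams = char_ngrams(word, self.n)
        total = 0.0
        for gram in grams:
            numerator = self.ngram_counts[gram] + 1
            denominator = self.context_counts[gram[:-1]] + self.vocab_size
            total += math.log(numerator / denominator)
        return total / len(grams)


def is_complex(word, simple_model, complex_model):
    """Label a word complex if the complex-word model scores it higher."""
    return complex_model.avg_log_prob(word) > simple_model.avg_log_prob(word)
```

In practice the two models would be trained on the labelled shared-task data for each language; the toy word lists one might use to try the sketch carry no empirical weight.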
Anthology ID:
W18-0537
Volume:
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Editors:
Joel Tetreault, Jill Burstein, Ekaterina Kochmar, Claudia Leacock, Helen Yannakoudakis
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
315–321
URL:
https://aclanthology.org/W18-0537
DOI:
10.18653/v1/W18-0537
Cite (ACL):
David Alfter and Ildikó Pilán. 2018. SB@GU at the Complex Word Identification 2018 Shared Task. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 315–321, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):
SB@GU at the Complex Word Identification 2018 Shared Task (Alfter & Pilán, BEA 2018)
PDF:
https://aclanthology.org/W18-0537.pdf