Daniel Alejandro Pérez Alvarez


2018

Complex Word Identification: Convolutional Neural Network vs. Feature Engineering
Segun Taofeek Aroyehun | Jason Angel | Daniel Alejandro Pérez Alvarez | Alexander Gelbukh
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

We describe the systems of the NLP-CIC team that participated in the Complex Word Identification (CWI) 2018 shared task. The shared task aimed to benchmark approaches for identifying complex words in English and other languages from the perspective of non-native speakers. Our goal is to compare two approaches: feature engineering and a deep neural network. Both approaches achieved comparable performance on the English test set. We demonstrated the flexibility of the deep-learning approach by using the same deep neural network setup in the Spanish track. Our systems achieved competitive results: all of them were within 0.01 of the best macro-F1 score on every test set except the Wikipedia test set, where our best system is 0.04 below the best macro-F1 score.