CAMB at CWI Shared Task 2018: Complex Word Identification with Ensemble-Based Voting

Sian Gooding, Ekaterina Kochmar


Abstract
This paper presents the winning systems we submitted to the Complex Word Identification Shared Task 2018. We describe the implementation of our best-performing systems and discuss the key findings of this research. In the monolingual English binary classification track, our best-performing systems achieve F1 scores of 0.8792 on the NEWS, 0.8430 on the WIKINEWS and 0.8115 on the WIKIPEDIA test sets. In the probabilistic track, they achieve mean absolute errors of 0.0558 on the NEWS, 0.0674 on the WIKINEWS and 0.0739 on the WIKIPEDIA test sets.
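For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how an F1 score and a mean absolute error can be computed. It is illustrative only and is not taken from the paper: the function names and the toy labels are hypothetical, and the abstract does not state how F1 was averaged, so the sketch assumes F1 on the positive ("complex") class.

```python
def f1_score(gold, pred):
    """F1 on the positive class (1 = complex word). Assumption: the
    shared task may have used a different averaging scheme."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_absolute_error(gold, pred):
    """Average absolute difference between gold and predicted scores."""
    return sum(abs(g - p) for g, p in zip(gold, pred)) / len(gold)


# Toy data, made up for illustration (not from the shared task).
binary_gold = [1, 0, 1, 1, 0]
binary_pred = [1, 1, 0, 1, 0]
prob_gold = [0.0, 0.5, 1.0]
prob_pred = [0.1, 0.5, 0.7]

f1 = f1_score(binary_gold, binary_pred)
mae = mean_absolute_error(prob_gold, prob_pred)
print(f"F1 = {f1:.4f}, MAE = {mae:.4f}")
```

Higher F1 is better in the binary track, while lower mean absolute error is better in the probabilistic track.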
Anthology ID:
W18-0520
Volume:
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Editors:
Joel Tetreault, Jill Burstein, Ekaterina Kochmar, Claudia Leacock, Helen Yannakoudakis
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
184–194
URL:
https://aclanthology.org/W18-0520
DOI:
10.18653/v1/W18-0520
Cite (ACL):
Sian Gooding and Ekaterina Kochmar. 2018. CAMB at CWI Shared Task 2018: Complex Word Identification with Ensemble-Based Voting. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 184–194, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):
CAMB at CWI Shared Task 2018: Complex Word Identification with Ensemble-Based Voting (Gooding & Kochmar, BEA 2018)
PDF:
https://aclanthology.org/W18-0520.pdf