CLULEX at SemEval-2021 Task 1: A Simple System Goes a Long Way

Greta Smolenska, Peter Kolb, Sinan Tang, Mironas Bitinis, Héctor Hernández, Elin Asklöv


Abstract
This paper presents the system we submitted to the first Lexical Complexity Prediction (LCP) Shared Task 2021. The Shared Task provides participants with a new English dataset that includes the context of each target word. We participate in the single-word complexity prediction sub-task and focus on feature engineering. Our best system is trained on linguistic features and word embeddings (Pearson correlation of 0.7942). We demonstrate, however, that a simpler feature set achieves comparable results, and we submit a model trained on 36 linguistic features (Pearson correlation of 0.7925).
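The scores quoted in the abstract are Pearson correlations between predicted and gold complexity ratings. As a rough illustration only, the metric can be sketched as below. The function name `pearson_r` and the toy ratings are hypothetical and are not taken from the paper or the shared-task data; this is not the authors' evaluation code.

```python
import math


def pearson_r(gold, predicted):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(gold) != len(predicted) or len(gold) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(gold)
    mean_g = sum(gold) / n
    mean_p = sum(predicted) / n
    # Covariance and variances (unnormalised; the 1/n factors cancel).
    cov = sum((g - mean_g) * (p - mean_p) for g, p in zip(gold, predicted))
    var_g = sum((g - mean_g) ** 2 for g in gold)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_g == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_g * var_p)


# Hypothetical toy ratings in [0, 1]; not real shared-task data.
toy_gold = [0.10, 0.25, 0.40, 0.55]
toy_pred = [0.15, 0.20, 0.45, 0.50]
print(round(pearson_r(toy_gold, toy_pred), 4))
```

A value near 1 means the predicted ranking and spacing of complexity scores closely track the gold ratings; the metric is insensitive to a constant offset or positive rescaling of the predictions.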
Anthology ID:
2021.semeval-1.81
Volume:
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP | SemEval
SIGs:
SIGLEX | SIGSEM
Publisher:
Association for Computational Linguistics
Pages:
632–639
URL:
https://aclanthology.org/2021.semeval-1.81
DOI:
10.18653/v1/2021.semeval-1.81
Cite (ACL):
Greta Smolenska, Peter Kolb, Sinan Tang, Mironas Bitinis, Héctor Hernández, and Elin Asklöv. 2021. CLULEX at SemEval-2021 Task 1: A Simple System Goes a Long Way. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), pages 632–639, Online. Association for Computational Linguistics.
Cite (Informal):
CLULEX at SemEval-2021 Task 1: A Simple System Goes a Long Way (Smolenska et al., SemEval 2021)
PDF:
https://aclanthology.org/2021.semeval-1.81.pdf