UNBNLP at SemEval-2021 Task 1: Predicting lexical complexity with masked language models and character-level encoders

Milton King, Ali Hakimi Parizi, Samin Fakharian, Paul Cook


Abstract
In this paper, we present three supervised systems for English lexical complexity prediction of single and multiword expressions for SemEval-2021 Task 1. We explore the use of statistical baseline features, masked language models, and character-level encoders to predict the complexity of a target token in context. Our best system combines information from these three sources. The results indicate that information from masked language models and character-level encoders can be combined to improve lexical complexity prediction.
Anthology ID:
2021.semeval-1.83
Volume:
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP | SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
650–654
URL:
https://aclanthology.org/2021.semeval-1.83
DOI:
10.18653/v1/2021.semeval-1.83
PDF:
https://aclanthology.org/2021.semeval-1.83.pdf