C3SL at SemEval-2021 Task 1: Predicting Lexical Complexity of Words in Specific Contexts with Sentence Embeddings

Raul Almeida, Hegler Tissot, Marcos Didonet Del Fabro


Abstract
We present our approach to predicting the lexical complexity of words in specific contexts, as entered in the LCP shared task (Task 1) at SemEval 2021. The approach consists of separating sentences into smaller chunks, embedding them with Sent2Vec, and reducing the embeddings to a simpler vector that serves as input to a neural network, which predicts the complexity of words and expressions. Results show that the pre-trained sentence embeddings are not able to capture lexical complexity when applied in cross-domain applications.
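The pipeline described in the abstract (chunk, embed, reduce, predict) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the fixed-size token chunking, the mean-pooling reduction, and the single sigmoid unit standing in for the neural network are choices made here, and `embed_chunk` is a hypothetical hash-based placeholder rather than a call to Sent2Vec. The weights are random and untrained, so the score carries no meaning beyond showing the data flow.

```python
import hashlib
import math
import random

DIM = 8  # toy embedding size (assumption); real sentence embeddings are much larger


def chunk_sentence(sentence, size=3):
    """Split a sentence into consecutive chunks of at most `size` tokens (assumed scheme)."""
    tokens = sentence.split()
    return [" ".join(tokens[i:i + size]) for i in range(0, len(tokens), size)]


def embed_chunk(chunk):
    """Hypothetical placeholder, NOT Sent2Vec: hash the text into DIM floats in [0, 1]."""
    digest = hashlib.sha256(chunk.lower().encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(DIM)]


def reduce_embeddings(vectors):
    """Reduce the chunk embeddings to one vector by element-wise mean (assumed reduction)."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(DIM)]


def predict_complexity(features, weights, bias):
    """Single sigmoid unit standing in for the neural network; output lies in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


rng = random.Random(0)
weights = [rng.uniform(-1, 1) for _ in range(DIM)]  # untrained, illustrative only

sentence = "The committee ratified the amendment after a protracted deliberation"
chunks = chunk_sentence(sentence)
features = reduce_embeddings([embed_chunk(c) for c in chunks])
score = predict_complexity(features, weights, bias=0.0)

print(chunks)
print(round(score, 3))
```

In a real setting, `embed_chunk` would be replaced by a pre-trained sentence encoder and the weights would be fitted to annotated complexity scores; the sketch only shows how the four stages connect.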
Anthology ID:
2021.semeval-1.88
Volume:
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
Month:
August
Year:
2021
Address:
Online
Editors:
Alexis Palmer, Nathan Schneider, Natalie Schluter, Guy Emerson, Aurelie Herbelot, Xiaodan Zhu
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
683–687
URL:
https://aclanthology.org/2021.semeval-1.88
DOI:
10.18653/v1/2021.semeval-1.88
Cite (ACL):
Raul Almeida, Hegler Tissot, and Marcos Didonet Del Fabro. 2021. C3SL at SemEval-2021 Task 1: Predicting Lexical Complexity of Words in Specific Contexts with Sentence Embeddings. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), pages 683–687, Online. Association for Computational Linguistics.
Cite (Informal):
C3SL at SemEval-2021 Task 1: Predicting Lexical Complexity of Words in Specific Contexts with Sentence Embeddings (Almeida et al., SemEval 2021)
PDF:
https://aclanthology.org/2021.semeval-1.88.pdf