MarSan at SemEval-2022 Task 11: Multilingual complex named entity recognition using T5 and transformer encoder

Ehsan Tavan; Maryam Najafi

doi:10.18653/v1/2022.semeval-1.226

Abstract

The multilingual complex named entity recognition task of SemEval-2022 required participants to detect semantically ambiguous and complex entities in 11 languages. To participate in this competition, we used a deep learning model built on the T5 text-to-text language model and its multilingual version, mT5, along with a transformer encoder module. We also introduce a subtoken check, which yields a 4% increase in the model's F1-score on English. We further examined the BPEmb model for converting input tokens to representation vectors. A performance evaluation of the proposed entity detection model is presented at the end of the paper: six different scenarios were defined, and the proposed model was evaluated in each scenario on the English development set. The model is also evaluated on other languages.

Anthology ID: 2022.semeval-1.226
Volume: Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
Month: July
Year: 2022
Address: Seattle, United States
Editors: Guy Emerson, Natalie Schluter, Gabriel Stanovsky, Ritesh Kumar, Alexis Palmer, Nathan Schneider, Siddharth Singh, Shyam Ratan
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 1639–1647
URL: https://aclanthology.org/2022.semeval-1.226
DOI: 10.18653/v1/2022.semeval-1.226
Cite (ACL): Ehsan Tavan and Maryam Najafi. 2022. MarSan at SemEval-2022 Task 11: Multilingual complex named entity recognition using T5 and transformer encoder. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), pages 1639–1647, Seattle, United States. Association for Computational Linguistics.
Cite (Informal): MarSan at SemEval-2022 Task 11: Multilingual complex named entity recognition using T5 and transformer encoder (Tavan & Najafi, SemEval 2022)
PDF: https://aclanthology.org/2022.semeval-1.226.pdf
Video: https://aclanthology.org/2022.semeval-1.226.mp4
Code: marsanteam/complex_ner_semeval
Data: CoNLL 2003, CoNLL++, MultiCoNER