BiTeM at WNUT 2020 Shared Task-1: Named Entity Recognition over Wet Lab Protocols using an Ensemble of Contextual Language Models

Julien Knafou, Nona Naderi, Jenny Copara, Douglas Teodoro, Patrick Ruch


Abstract
Recent improvements in machine-reading technologies have attracted much attention to automation problems and their possibilities. In this context, WNUT 2020 introduces a Named Entity Recognition (NER) task based on wet laboratory procedures. In this paper, we present a 3-step method based on deep neural language models that achieved the best overall exact match F1-score (77.99%) of the competition. By fine-tuning each of 10 different pretrained language models 10 times, this work shows the advantage of having more models in an ensemble based on a majority-vote strategy. In addition, having 100 different models allowed us to analyse combinations of ensembles, which demonstrated the impact of having multiple pretrained models versus fine-tuning a single pretrained model multiple times.
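As an illustration of the majority-vote idea mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes each model emits one label per token for the same tokenization, that a label is kept only when strictly more than half of the models agree, and that the outside label "O" is used otherwise. The function name `majority_vote`, the `fallback` parameter, and the example labels are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]], fallback: str = "O") -> List[str]:
    """Combine per-token label sequences from several models by strict majority.

    Assumption: a label wins only if more than half of the models predict it;
    otherwise the token falls back to `fallback` (here the outside label "O").
    """
    if not predictions:
        return []
    n_tokens = len(predictions[0])
    if any(len(p) != n_tokens for p in predictions):
        raise ValueError("all models must label the same number of tokens")

    n_models = len(predictions)
    voted = []
    # zip(*predictions) yields, for each token, the labels from every model.
    for token_labels in zip(*predictions):
        label, count = Counter(token_labels).most_common(1)[0]
        voted.append(label if count > n_models / 2 else fallback)
    return voted


# Illustrative predictions from three hypothetical fine-tuned models.
model_outputs = [
    ["B-Action", "B-Reagent", "I-Reagent", "O"],
    ["B-Action", "B-Reagent", "O", "O"],
    ["B-Action", "O", "I-Reagent", "O"],
]
ensemble = majority_vote(model_outputs)
print(ensemble)
```

With 100 fine-tuned models, the same function could be applied to any subset of their outputs, which is one way to compare ensembles built from many pretrained models against ensembles built from repeated fine-tuning runs of a single model.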
Anthology ID:
2020.wnut-1.40
Volume:
Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)
Month:
November
Year:
2020
Address:
Online
Venues:
EMNLP | WNUT
Publisher:
Association for Computational Linguistics
Pages:
305–313
URL:
https://aclanthology.org/2020.wnut-1.40
DOI:
10.18653/v1/2020.wnut-1.40
Cite (ACL):
Julien Knafou, Nona Naderi, Jenny Copara, Douglas Teodoro, and Patrick Ruch. 2020. BiTeM at WNUT 2020 Shared Task-1: Named Entity Recognition over Wet Lab Protocols using an Ensemble of Contextual Language Models. In Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020), pages 305–313, Online. Association for Computational Linguistics.
Cite (Informal):
BiTeM at WNUT 2020 Shared Task-1: Named Entity Recognition over Wet Lab Protocols using an Ensemble of Contextual Language Models (Knafou et al., WNUT 2020)
PDF:
https://aclanthology.org/2020.wnut-1.40.pdf
Data
WNUT 2020