UACH-INAOE at SMM4H: a BERT based approach for classification of COVID-19 Twitter posts
Alberto Valdes | Jesus Lopez | Manuel Montes
Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task
This work describes the participation of the Universidad Autónoma de Chihuahua - Instituto Nacional de Astrofísica, Óptica y Electrónica team at the Social Media Mining for Health Applications (SMM4H) 2021 shared task. Our team participated in task 5 and 6, both focused on the automatic classification of Twitter posts related to COVID-19. Task 5 was oriented on solving a binary classification problem, trying to identify self-reporting tweets of potential cases of COVID-19. Task 6 objective was to classify tweets containing COVID-19 symptoms. For both tasks we used models based on bidirectional encoder representations from transformers (BERT). Our objective was to determine if a model pretrained on a corpus in the domain of interest can outperform one trained on a much larger general domain corpus. Our F1 results were encouraging, 0.77 and 0.95 for task 5 and 6 respectively, having achieved the highest score among all the participants in the latter.