Weimin Yuan


2022

pdf bib
zydhjh4593@SMM4H’22: A Generic Pre-trained BERT-based Framework for Social Media Health Text Classification
Chenghao Huang | Xiaolu Chen | Yuxi Chen | Yutong Wu | Weimin Yuan | Yan Wang | Yanru Zhang
Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task

This paper describes our proposed framework for the 10 text classification tasks of Task 1a, 2a, 2b, 3a, 4, 5, 6, 7, 8, and 9, in the Social Media Mining for Health (SMM4H) 2022. According to the pre-trained BERT-based models, various techniques, including regularized dropout, focal loss, exponential moving average, 5-fold cross-validation, ensemble prediction, and pseudo-labeling, are applied for further formulating and improving the generalization performance of our framework. In the evaluation, the proposed framework achieves the 1st place in Task 3a with a 7% higher F1-score than the median, and obtains a 4% higher averaged F1-score than the median in all participating tasks except Task 1a.