Pan He


2023

Zhegu at SemEval-2023 Task 9: Exponential Penalty Mean Squared Loss for Multilingual Tweet Intimacy Analysis
Pan He | Yanru Zhang
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

We present the system description of our team Zhegu for SemEval-2023 Task 9, Multilingual Tweet Intimacy Analysis. We propose EPM (Exponential Penalty Mean Squared Loss) to improve the model's ability to learn from difficult samples during training. We also apply several methods (frozen tuning and language-based contrastive learning) to the XLM-R multilingual language model for fine-tuning and model ensembling. Our experimental results provide strong evidence of the effectiveness of these methods. Our final system achieved a Pearson score of 0.567 on the test set.

2022

Zhegu@SMM4H-2022: The Pre-training Tweet & Claim Matching Makes Your Prediction Better
Pan He | Chen YuZe | Yanru Zhang
Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task

SMM4H-2022 (CITATION) Task 2 is to detect whether users' tweets about COVID-19 on social media contain a premise, or what stance they take toward the claims. In this paper, we propose Tweet Claim Matching (TCM), a new pre-training task constructed from tweets and claims in a manner similar to Next Sentence Prediction (NSP). We first continue pre-training standard pre-trained language models on the labelled dataset and then fine-tune them to obtain better performance. Compared with the solid baseline (CITATION), we achieve an absolute improvement of 7.9% in Task 2a and obtain SOTA results.