UoR at SemEval-2021 Task 7: Utilizing Pre-trained DistilBERT Model and Multi-scale CNN for Humor Detection

Zehao Liu, Carl Haines, Huizhi Liang


Abstract
Humour detection is an interesting but difficult task in NLP because humour might not be obvious in text: it can be embedded in context, hide behind the literal meaning, and require prior knowledge to understand. We explored different shallow and deep methods to create a humour detection classifier for Task 7-1a. Models such as Logistic Regression, LSTM, MLP, and CNN were used, and pre-trained models such as DistilBERT were introduced to generate accurate vector representations of textual data. We focused on applying a multi-scale strategy in modelling and compared different models. Our best model, DistilBERT+MultiScale CNN, uses CNN kernels of different sizes to extract features at multiple scales and achieved a 93.7% F1-score and 92.1% accuracy on the test set.
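The multi-scale idea in the abstract, where kernels of several widths slide over the token vectors and each produces its own features, can be illustrated with a minimal sketch. This is not the authors' implementation: the kernel widths (2, 3, 4), the filter count, the random weights, the ReLU plus max-over-time pooling, and the helper names `multi_scale_features` and `random_kernels` are all assumptions made for illustration, and the toy vectors merely stand in for DistilBERT token embeddings.

```python
import random


def multi_scale_features(embeddings, kernels):
    """Apply 1D convolution filters of several widths, then max-pool over time.

    embeddings: list of T token vectors, each a list of D floats
                (a stand-in for DistilBERT outputs; an assumption).
    kernels:    dict mapping kernel width k to a list of filters,
                each filter being a k x D nested list of weights.
    Returns one pooled activation per filter, ordered by kernel width.
    """
    features = []
    for width, filters in sorted(kernels.items()):
        for filt in filters:
            # Starting at 0.0 applies ReLU: negative activations never win.
            # If the sequence is shorter than the kernel, the feature stays 0.0.
            best = 0.0
            for start in range(len(embeddings) - width + 1):
                window = embeddings[start:start + width]
                activation = sum(
                    w * x
                    for w_row, x_row in zip(filt, window)
                    for w, x in zip(w_row, x_row)
                )
                best = max(best, activation)
            features.append(best)
    return features


def random_kernels(widths, n_filters, dim, seed=0):
    """Hypothetical helper: random filters in place of trained weights."""
    rng = random.Random(seed)
    return {
        k: [
            [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(k)]
            for _ in range(n_filters)
        ]
        for k in widths
    }


if __name__ == "__main__":
    rng = random.Random(1)
    tokens = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(6)]
    kernels = random_kernels(widths=(2, 3, 4), n_filters=5, dim=4)
    # 3 kernel widths x 5 filters give a 15-dimensional feature vector.
    print(len(multi_scale_features(tokens, kernels)))
```

In a full model, the concatenated feature vector would feed a classification layer, and the filter weights would be learned during training rather than sampled at random.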
Anthology ID:
2021.semeval-1.166
Volume:
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP | SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
1179–1184
URL:
https://aclanthology.org/2021.semeval-1.166
DOI:
10.18653/v1/2021.semeval-1.166
PDF:
https://aclanthology.org/2021.semeval-1.166.pdf