You Told Me That Joke Twice: A Systematic Investigation of Transferability and Robustness of Humor Detection Models

Alexander Baranov, Vladimir Kniazhevsky, Pavel Braslavski


Abstract
In this study, we focus on automatic humor detection, a highly relevant task for conversational AI. To date, there are several English datasets for this task, but little research on how models trained on them generalize and behave in the wild. To fill this gap, we carefully analyze existing datasets, train RoBERTa-based and Naïve Bayes classifiers on each of them, and test on the rest. Training and testing on the same dataset yields good results, but the transferability of the models varies widely. Models trained on datasets with jokes from different sources show better transferability, while the amount of training data has a smaller impact. The behavior of the models on out-of-domain data is unstable, suggesting that some of the models overfit, while others learn non-specific humor characteristics. An adversarial attack shows that models trained on pun datasets are less robust. We also evaluate the sense of humor of the ChatGPT and Flan-UL2 models in a zero-shot scenario. The LLMs demonstrate competitive results on humor datasets and more stable behavior on out-of-domain data. We believe that the obtained results will facilitate the development of new datasets and evaluation methodologies in the field of computational humor. We have made all the data from the study and the trained models publicly available at https://github.com/Humor-Research/Humor-detection.
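The train-on-one, test-on-the-rest protocol described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the tiny multinomial Naïve Bayes below, the whitespace tokenizer, the use of F1 as the metric, and the toy dataset names and sentences are all assumptions made for illustration only.

```python
import math
from collections import Counter


def train_nb(examples):
    """Fit a multinomial Naive Bayes model on (text, label) pairs, label in {0, 1}."""
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocab


def predict_nb(model, text):
    """Return 1 (humorous) or 0 (not humorous) using add-one smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (0, 1):
        score = math.log((doc_counts[label] + 1) / (total_docs + 2))
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # tokens unseen in training are skipped
                score += math.log((word_counts[label][token] + 1) / denom)
        scores[label] = score
    return 1 if scores[1] > scores[0] else 0


def f1(gold, pred):
    """F1 score for the positive (humorous) class."""
    tp = sum(g == 1 and p == 1 for g, p in zip(gold, pred))
    fp = sum(g == 0 and p == 1 for g, p in zip(gold, pred))
    fn = sum(g == 1 and p == 0 for g, p in zip(gold, pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def transfer_matrix(datasets):
    """Train on each dataset's train split and score on every dataset's test split."""
    matrix = {}
    for src, src_data in datasets.items():
        model = train_nb(src_data["train"])
        for tgt, tgt_data in datasets.items():
            gold = [label for _, label in tgt_data["test"]]
            pred = [predict_nb(model, text) for text, _ in tgt_data["test"]]
            matrix[(src, tgt)] = f1(gold, pred)
    return matrix


# Hypothetical toy data; the real study uses existing English humor datasets.
toy_datasets = {
    "puns": {
        "train": [("i used to be a banker but i lost interest", 1),
                  ("the bank raised its interest rate", 0)],
        "test": [("the banker lost interest", 1),
                 ("the rate was raised", 0)],
    },
    "one_liners": {
        "train": [("why did the chicken cross the road", 1),
                  ("the road was closed for repairs", 0)],
        "test": [("why did the chicken cross", 1),
                 ("closed for repairs", 0)],
    },
}

matrix = transfer_matrix(toy_datasets)
for (src, tgt), score in sorted(matrix.items()):
    print(f"{src} -> {tgt}: F1 = {score:.2f}")
```

Diagonal entries (same source and target) correspond to in-domain evaluation, and off-diagonal entries to transfer. With data this small the scores are not meaningful; the sketch only shows the shape of the evaluation loop.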
Anthology ID:
2023.emnlp-main.845
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
13701–13715
URL:
https://aclanthology.org/2023.emnlp-main.845
DOI:
10.18653/v1/2023.emnlp-main.845
Cite (ACL):
Alexander Baranov, Vladimir Kniazhevsky, and Pavel Braslavski. 2023. You Told Me That Joke Twice: A Systematic Investigation of Transferability and Robustness of Humor Detection Models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 13701–13715, Singapore. Association for Computational Linguistics.
Cite (Informal):
You Told Me That Joke Twice: A Systematic Investigation of Transferability and Robustness of Humor Detection Models (Baranov et al., EMNLP 2023)
PDF:
https://aclanthology.org/2023.emnlp-main.845.pdf
Video:
https://aclanthology.org/2023.emnlp-main.845.mp4