Workshop on Computational Humor (CHum) (2025)


Proceedings of the 1st Workshop on Computational Humor (CHum)

Christian F. Hempelmann | Julia Rayz | Tiansi Dong | Tristan Miller

The Exception of Humor: Iconicity, Phonemic Surprisal, Memory Recall, and Emotional Associations
Alexander Kilpatrick | Maria Flaksman

This meta-study explores the relationships between humor, phonemic bigram surprisal, emotional valence, and memory recall. Prior research indicates that words with higher phonemic surprisal are more readily remembered, suggesting that unpredictable phoneme sequences promote long-term memory recall. Emotional valence is another well-documented factor influencing memory, with negative experiences and stimuli typically being remembered more easily than positive ones. Building on existing findings, this study highlights that words with negative associations often exhibit greater surprisal and are easier to recall. Humor, however, presents an exception: while associated with positive emotions, humorous words also display heightened surprisal and enhanced memorability.
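As a rough illustration of the central quantity, the surprisal of a phoneme given the preceding one can be written as -log2 P(current | previous) and averaged over a word. The sketch below is not the authors' pipeline: the toy corpus, the `#` word-boundary symbol, the unsmoothed counts, and the function names are all assumptions made for the example.

```python
import math
from collections import Counter

def train_bigram_counts(words):
    """Count phoneme bigrams over a list of phoneme sequences.

    "#" is a hypothetical word-boundary symbol prepended to each word.
    """
    context_counts = Counter()
    bigram_counts = Counter()
    for phones in words:
        padded = ["#"] + list(phones)
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts

def mean_bigram_surprisal(phones, context_counts, bigram_counts):
    """Average of -log2 P(cur | prev) over the phonemes of one word.

    No smoothing is applied, so unseen bigrams are rejected explicitly.
    """
    padded = ["#"] + list(phones)
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        count = bigram_counts[(prev, cur)]
        if count == 0:
            raise ValueError(f"unseen bigram: {prev!r} -> {cur!r}")
        total += -math.log2(count / context_counts[prev])
    return total / len(phones)

# Toy corpus of three "words" as phoneme lists (illustrative only).
corpus = [["k", "a", "t"], ["k", "a", "p"], ["t", "a", "p"]]
contexts, bigrams = train_bigram_counts(corpus)
kat = mean_bigram_surprisal(["k", "a", "t"], contexts, bigrams)
kap = mean_bigram_surprisal(["k", "a", "p"], contexts, bigrams)
```

In this toy corpus "t" follows "a" less often than "p" does, so `kat` comes out with a higher mean surprisal than `kap`.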

Text Is Not All You Need: Multimodal Prompting Helps LLMs Understand Humor
Ashwin Baluja

While Large Language Models (LLMs) have demonstrated impressive natural language understanding capabilities across various text-based tasks, understanding humor has remained a persistent challenge. Humor is frequently multimodal, relying not only on the meaning of the words, but also their pronunciations, and even the speaker’s intonations. In this study, we explore a simple multimodal prompting approach to humor understanding and explanation. We present an LLM with both the text and the spoken form of a joke, generated using an off-the-shelf text-to-speech (TTS) system. Using multimodal cues improves the explanations of humor compared to textual prompts across all tested datasets.

Rule-based Approaches to the Automatic Generation of Puns Based on Given Names in French
Mathieu Dehouck | Marine Delaborde

Humor is a cornerstone of human interactions. Because puns and wordplay lie at the margins of phonology, syntax and semantics, large language models struggle with their generation. In this paper, we present two versions of a tool designed to create a typical kind of French joke known as “Monsieur et Madame” jokes. We then discuss the main challenges and limitations that rule-based systems face when creating this kind of pun.

Homophonic Pun Generation in Code Mixed Hindi English
Yash Raj Sarrof

In this study, we investigate Hinglish—a blend of Hindi and English commonly found in informal online communication—with a particular focus on automated pun generation. Our work examines the applicability and adaptability of existing English pun generation pipelines to Hinglish. We assess the pun generation capabilities of Large Language Models (LLMs), particularly GPT-3.5. By employing Chain of Thought prompting and Self-Refine techniques, we identify cross-linguistic homophone detection as a central difficulty. To address this, we propose a novel algorithm for cross-lingual homophone identification and develop a Latin-to-Devanagari transliteration module to leverage the widespread use of Latin-script Hindi in online settings. Building on existing frameworks for pun generation, we incorporate our homophone and transliteration modules to improve output quality. Crowd-sourced human evaluations validate the effectiveness of our approach.

Bridging Laughter Across Languages: Generation of Hindi-English Code-mixed Puns
Likhith Asapu | Prashant Kodali | Ashna Dua | Kapil Rajesh Kavitha | Manish Shrivastava

Puns, as a linguistic phenomenon, hold significant importance in both humor and language comprehension. While extensive research has been conducted on pun generation in English, there is a notable gap in the exploration of pun generation within code-mixed text, particularly Hindi-English code-mixed text. This study addresses this gap by offering a computational method specifically designed to create puns in Hindi-English code-mixed text. We investigate three distinct methodologies for pun generation using pun-alternate word pairs. Furthermore, our novel dataset, HECoP, comprising 2,000 human-annotated sentences, serves as a foundational resource for training diverse pun detection models. Additionally, we developed a structured pun generation pipeline capable of generating puns from a single input word without relying on predefined word pairs. Through rigorous human evaluations, our study demonstrates the efficacy of our proposed models in generating code-mixed puns. The findings lay a solid groundwork for future endeavours in pun generation and computational humor within diverse linguistic contexts.

Testing Humor Theory Using Word and Sentence Embeddings
Stephen Skalicky | Salvatore Attardo

A basic prediction of incongruity theory is that semantic scripts in verbal humor should be in a state of incongruity. We test this prediction using a dataset of 1,182 word/phrase pairs extracted from a set of imperfect puns. Incongruity is defined as the cosine distance between the word vector representations of each pair. We compare these pun distances with similarity metrics between the pun words and their synonyms, extracted from WordNet. Results indicate a significantly lower degree of similarity between pun words than between pun words and their synonyms. Our findings support the basic predictions of incongruity theory and provide computational researchers with a baseline metric to model humorous incongruity.
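For readers unfamiliar with the metric, cosine distance is one minus the cosine of the angle between two vectors. The following is a minimal sketch using hand-made three-dimensional vectors; the vectors and variable names are assumptions for illustration, not the embeddings or data used in the paper.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v) for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)

# Toy vectors (hypothetical): the pun pair points in unrelated directions,
# while the "synonym" points in the same direction as the pun word.
pun_word = [1.0, 0.0, 0.0]
pun_target = [0.0, 1.0, 0.0]
synonym = [2.0, 0.0, 0.0]

pun_distance = cosine_distance(pun_word, pun_target)
synonym_distance = cosine_distance(pun_word, synonym)
```

Under the prediction described above, pun pairs would tend to sit farther apart than synonym pairs, as `pun_distance` does relative to `synonym_distance` in this toy case.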

Pragmatic Metacognitive Prompting Improves LLM Performance on Sarcasm Detection
Joshua Lee | Wyatt Fong | Alexander Le | Sur Shah | Kevin Han | Kevin Zhu

Sarcasm detection is a significant challenge in sentiment analysis due to the nuanced and context-dependent nature of language. We introduce Pragmatic Metacognitive Prompting (PMP) to improve the performance of Large Language Models (LLMs) in sarcasm detection. PMP leverages principles from pragmatics and reflection, helping LLMs interpret implied meanings, consider contextual cues, and reflect on discrepancies to identify sarcasm. Evaluated with LLMs such as LLaMA-3-8B, GPT-4o, and Claude 3.5 Sonnet, PMP achieves state-of-the-art performance with GPT-4o on MUStARD and SemEval2018. This study demonstrates that integrating pragmatic reasoning and metacognitive strategies into prompting significantly enhances LLMs’ ability to detect sarcasm, offering a promising direction for future research in sentiment analysis.

Can AI Make Us Laugh? Comparing Jokes Generated by Witscript and a Human Expert
Joe Toplyn | Ori Amir

This study compares the funniness of AI-generated jokes and those written by a professional human joke writer, using audience laughter as a direct measure. Prior research has typically relied on numerical ratings, which have limitations. Our findings show that AI-generated jokes elicited as much laughter as human-crafted ones, indicating that advanced AI joke generators can now produce original jokes on par with those of a professional human comedy writer.

Evaluating Human Perception and Bias in AI-Generated Humor
Narendra Nath Joshi

This paper explores human perception of AI-generated humor, examining biases and the ability to distinguish between human and AI-created jokes. Through a between-subjects user study involving 174 participants, we tested hypotheses on quality perception, source identification, and demographic influences. Our findings reveal that AI-generated jokes are rated comparably to human-generated ones, with source blindness improving AI humor ratings. Participants struggled to identify AI-generated jokes accurately, and repeated exposure led to increased appreciation. Younger participants showed more favorable perceptions, while technical background had no significant impact. These results challenge preconceptions about AI’s humor capabilities and highlight the importance of addressing biases in AI content evaluation. We also suggest pathways for enhancing human-AI creative collaboration and underscore the need for transparency and ethical considerations in AI-generated content.

The Theater Stage as Laboratory: Review of Real-Time Comedy LLM Systems for Live Performance
Piotr Mirowski | Kory Mathewson | Boyd Branch

In this position paper, we review the eclectic recent history of academic and artistic works involving computational systems for humor generation, and focus specifically on live performance. We make the case that AI comedy should be evaluated in live conditions, in front of audiences sharing either physical or online spaces, and under real-time constraints. We further suggest that improvised comedy is therefore the perfect substrate for deploying and assessing computational humor systems. Using examples of successful AI-infused shows, we demonstrate that live performance raises three sets of challenges for computational humor generation: 1) questions around robotic embodiment, anthropomorphism and competition between humans and machines, 2) questions around comedic timing and the nature of audience interaction, and 3) questions about the human interpretation of seemingly absurd AI-generated humor. We argue that these questions impact the choice of methodologies for evaluating computational humor, as any such method needs to work around the constraints of live audiences and performance spaces. These interrogations also highlight different types of collaborative relationships between human comedians and AI tools.

The Algorithm is the Message: Computing as a Humor-Generating Mode
Vittorio Marone

This position paper starts from the examination of the “Universal Handbook for Political Speeches,” a satirical manual created in communist-era Poland as a modular tool to parody propaganda’s rigid linguistic patterns and its absence of meaning, humorously revealing the absurdity of totalitarian “newspeak.” Presented here in English for the first time, the “Handbook” is explored as an analog precursor to computational humor systems. More importantly, this artifact shows that humor, rather than only being the product of computing, can also arise from a computationalized, combinatorial structure and process. This shifts the focus to computational algorithms and processes as a mode of humor generation, rather than a tool. That is, computing itself (with its processes, structure, iteration, and combinatorial logic) can be a source of humor, rather than an instrument to fabricate it. The very workings of the machine are what can make us laugh, regardless of what the machine carries or produces. The “Handbook” functions here as a spark for reflection, and hopefully a broader discussion, on how this alternative view may impact the evolution of computational humor and its applications at the dawn of the era of artificial general intelligence.