EunJeong Hwang


2024

Proceedings of the 1st Workshop on Personalization of Generative AI Systems (PERSONALIZE 2024)
Ameet Deshpande | EunJeong Hwang | Vishvak Murahari | Joon Sung Park | Diyi Yang | Ashish Sabharwal | Karthik Narasimhan | Ashwin Kalyan
Proceedings of the 1st Workshop on Personalization of Generative AI Systems (PERSONALIZE 2024)

2023

Knowledge Graph Compression Enhances Diverse Commonsense Generation
EunJeong Hwang | Veronika Thost | Vered Shwartz | Tengfei Ma
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Generating commonsense explanations requires reasoning about commonsense knowledge beyond what is explicitly mentioned in the context. Existing models use commonsense knowledge graphs such as ConceptNet to extract a subgraph of relevant knowledge pertaining to concepts in the input. However, due to the large coverage and, consequently, vast scale of ConceptNet, the extracted subgraphs may contain loosely related, redundant and irrelevant information, which can introduce noise into the model. We propose to address this by applying a differentiable graph compression algorithm that focuses on the relevant knowledge for the task. The compressed subgraphs yield considerably more diverse outputs when incorporated into models for the tasks of generating commonsense and abductive explanations. Moreover, our model achieves a better quality-diversity tradeoff than a large language model with 100 times the number of parameters. Our generic approach can be applied to additional NLP tasks that can benefit from incorporating external knowledge.

MemeCap: A Dataset for Captioning and Interpreting Memes
EunJeong Hwang | Vered Shwartz
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Memes are a widely popular tool for web users to express their thoughts using visual metaphors. Understanding memes requires recognizing and interpreting visual metaphors with respect to the text inside or around the meme, often while employing background knowledge and reasoning abilities. We present the task of meme captioning and release a new dataset, MemeCap. Our dataset contains 6.3K memes along with the title of the post containing the meme, the meme captions, the literal image caption, and the visual metaphors. Despite the recent success of vision and language (VL) models on tasks such as image captioning and visual question answering, our extensive experiments using state-of-the-art VL models show that they still struggle with visual metaphors, and perform substantially worse than humans.

Aligning Language Models to User Opinions
EunJeong Hwang | Bodhisattwa Majumder | Niket Tandon
Findings of the Association for Computational Linguistics: EMNLP 2023

An important aspect of developing LLMs that interact with humans is to align models’ behavior with their users. It is possible to prompt an LLM into behaving as a certain persona, especially a user group or ideological persona the model captured during its pretraining stage. However, how best to align an LLM with a specific user, rather than a demographic or ideological group, remains an open question. Mining public opinion surveys (by Pew Research), we find that the opinions of a user and their demographics and ideologies are not mutual predictors. We use this insight to align LLMs by modeling relevant past user opinions in addition to user demographics and ideology, achieving accuracy gains of up to 7 points in predicting public opinions from survey questions across a broad set of topics. Our work opens up research avenues for incorporating user opinions as an important ingredient in aligning language models.

2022

Event-Event Relation Extraction using Probabilistic Box Embedding
EunJeong Hwang | Jay-Yoon Lee | Tianyi Yang | Dhruvesh Patel | Dongxu Zhang | Andrew McCallum
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

To understand a story with multiple events, it is important to capture the proper relations across these events. However, existing event relation extraction (ERE) frameworks regard it as a multi-class classification task and do not guarantee any coherence between different relation types, such as anti-symmetry. If a phone line “died” after a “storm”, then it is obvious that the “storm” happened before the line “died”. Current ERE frameworks do not guarantee this coherence and instead enforce it via a constraint loss function (Wang et al., 2020). In this work, we propose to modify the underlying ERE model to guarantee coherence by representing each event as a box representation (BERE) without applying explicit constraints. In our experiments, BERE also shows stronger conjunctive constraint satisfaction while performing on par or better in F1 compared to previous models with constraint injection.