Jaewook Lee


2024

pdf bib
Generative Interpretation: Toward Human-Like Evaluation for Educational Question-Answer Pair Generation
Hyeonseok Moon | Jaewook Lee | Sugyeong Eo | Chanjun Park | Jaehyung Seo | Heuiseok Lim
Findings of the Association for Computational Linguistics: EACL 2024

Educational question-answer generation has been extensively researched owing to its practical applicability. However, we have identified a persistent challenge concerning the evaluation of such systems. Existing evaluation methods often fail to produce objective results and instead exhibit a bias towards favoring high similarity to the ground-truth question-answer pairs. In this study, we demonstrate that these evaluation methods yield low human alignment and propose an alternative approach called Generative Interpretation (GI) to achieve more objective evaluations. Through experimental analysis, we reveal that GI outperforms existing evaluation methods in terms of human alignment, and even shows comparable performance with GPT3.5, only with BART-large.

2023

pdf bib
A Framework for Vision-Language Warm-up Tasks in Multimodal Dialogue Models
Jaewook Lee | Seongsik Park | Seong-Heum Park | Hongjin Kim | Harksoo Kim
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Most research on multimodal open-domain dialogue agents has focused on pretraining and multi-task learning using additional rich datasets beyond a given target dataset. However, methods for exploiting these additional datasets can be quite limited in real-world settings, creating a need for more efficient methods for constructing agents based solely on the target dataset. To address these issues, we present a new learning strategy called vision-language warm-up tasks for multimodal dialogue models (VLAW-MDM). This strategy does not require the use of large pretraining or multi-task datasets but rather relies solely on learning from target data. Moreover, our proposed approach automatically generate captions for images and incorporate them into the model’s input to improve the contextualization of visual information. Using this novel approach, we empirically demonstrate that our learning strategy is effective for limited data and relatively small models. The result show that our method achieved comparable and in some cases superior performance compared to existing state-of-the-art models on various evaluation metrics.

pdf bib
CHEF in the Language Kitchen: A Generative Data Augmentation Leveraging Korean Morpheme Ingredients
Jaehyung Seo | Hyeonseok Moon | Jaewook Lee | Sugyeong Eo | Chanjun Park | Heuiseok Lim
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Korean morphological variations present unique opportunities and challenges in natural language processing (NLP), necessitating an advanced understanding of morpheme-based sentence construction. The complexity of morphological variations allows for diverse sentence forms based on the syntactic-semantic integration of functional morphemes (i.e., affixes) to lexical morphemes (i.e., roots). With this in mind, we propose a method - CHEF, replicating the morphological transformations inherent in sentences based on lexical and functional morpheme combinations through generative data augmentation. CHEF operates using a morpheme blender and a label discriminator, thereby enhancing the diversity of Korean sentence forms by capturing the properties of agglutination while maintaining label consistency. We conduct experiments on Korean multiple classification datasets, improving model performance in full- and few-shot settings. Our proposed method boosts performance beyond the preceding data augmentation methods without incurring external data usage. We demonstrate that our approach achieves comparable results yielded by augmentation techniques that use large language models (LLMs).