Youngwon Lee


2023

Learning to Rank Generation with Pairwise Partial Rewards
Youngwon Lee | Jinu Lee | Seung-won Hwang
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

This paper studies the use of reinforcement learning (RL) for conditional text generation, which overcomes a limitation of the prevalent supervised maximum likelihood estimation approach. RL, however, still suffers from challenges including the large action space and the delayed reward, as the reward can be computed only after an entire sequence is generated. To address these challenges, we propose a method that provides partial rewards for intermediate actions taken on partial sequences, enabling the model to promptly prioritize actions that lead to more desirable sequences. The key contribution of our method is its focus on distinguishing relatively more desirable actions rather than striving to precisely estimate pointwise values for arbitrary partial sequences: the model learns to discern the relative desirability between pairs of actions, i.e., to rank actions in a pairwise manner, only when necessary and feasible. This is realized efficiently by leveraging a prefix tree constructed from the sampled sequences. Experimental results on paraphrase generation and constrained machine translation tasks demonstrate the effectiveness of our method.
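To make the prefix-tree mechanism concrete, the following is a minimal Python sketch of how pairwise partial rewards could be derived from sampled sequences. The names (PrefixNode, build_prefix_tree, pairwise_preferences) are hypothetical, and comparing mean downstream rewards of sibling branches is an illustrative choice, not necessarily the paper's exact criterion.

class PrefixNode:
    # One node of a prefix tree over sampled token sequences.
    def __init__(self):
        self.children = {}   # token -> PrefixNode
        self.rewards = []    # final rewards of sampled sequences passing through

def build_prefix_tree(samples):
    # samples: list of (token_sequence, final_reward) pairs
    root = PrefixNode()
    for tokens, reward in samples:
        node = root
        node.rewards.append(reward)
        for tok in tokens:
            node = node.children.setdefault(tok, PrefixNode())
            node.rewards.append(reward)
    return root

def pairwise_preferences(node, prefix=()):
    # Yield (prefix, better_token, worse_token) triples for sibling actions
    # whose downstream rewards differ, i.e. where a pairwise ranking is feasible.
    tokens = list(node.children)
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens)):
            a, b = tokens[i], tokens[j]
            ra = sum(node.children[a].rewards) / len(node.children[a].rewards)
            rb = sum(node.children[b].rewards) / len(node.children[b].rewards)
            if ra != rb:
                better, worse = (a, b) if ra > rb else (b, a)
                yield prefix, better, worse
    for tok, child in node.children.items():
        yield from pairwise_preferences(child, prefix + (tok,))

For example, from samples [(("the", "cat"), 1.0), (("the", "dog"), 0.2)] the tree branches only after the prefix ("the",), where "cat" is ranked above "dog". Such pairs can then supervise a pairwise ranking loss on intermediate actions, and they exist only where siblings occur and their rewards differ, matching the "only when necessary and feasible" framing above.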

On Sample-Efficient Code Generation
Hojae Han | Yu Jin Kim | Byoungjip Kim | Youngwon Lee | Kyungjae Lee | Kyungmin Lee | Moontae Lee | Kyunghoon Bae | Seung-won Hwang
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track

Large language models often struggle to predict runtime behavior in code generation tasks, leading to a reliance on rejection sampling (best-of-n): generating multiple code snippets and then selecting the best one. Our distinction is reducing sampling costs without compromising generation quality. We introduce EFFICODE, a novel framework that prioritizes sampling on test problems that models can solve, and we show how EFFICODE estimates solvability to optimize computational costs during multiple sampling. Empirically, EFFICODE consistently reduces sampling budgets while maintaining comparable code generation performance, especially when problems are challenging. In addition, using EFFICODE to rank sampled code snippets proves effective for answer code selection, reducing time costs by not requiring any execution or test case generation.
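Below is a minimal sketch of solvability-prioritized best-of-n sampling, assuming two hypothetical callbacks that stand in for EFFICODE's actual components: sample_code draws one candidate snippet from the model, and estimate_solvability scores a problem given the attempts made so far.

import heapq

def allocate_samples(problems, sample_code, estimate_solvability, total_budget):
    # Greedy allocation: instead of spending n draws uniformly per problem,
    # always give the next draw to the problem with the highest estimated
    # solvability, re-scoring after each draw.
    heap = [(-estimate_solvability(p, []), i) for i, p in enumerate(problems)]
    heapq.heapify(heap)
    attempts = [[] for _ in problems]
    for _ in range(total_budget):
        _, i = heapq.heappop(heap)                    # most promising problem
        attempts[i].append(sample_code(problems[i]))  # one model draw
        # problems judged solved (or hopeless) sink in priority
        heapq.heappush(heap, (-estimate_solvability(problems[i], attempts[i]), i))
    return attempts

The design point is that the sampling budget becomes a single global pool: draws flow toward the problems where one more sample is most likely to pay off, rather than being spent uniformly across all problems.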

On Consistency Training for Language-Based Image Editing Interface
Youngwon Lee | Ayoung Lee | Yeonjoon Jung | Seung-won Hwang
Proceedings of the Second Workshop on Natural Language Interfaces

2022

Normalizing Mutual Information for Robust Adaptive Training for Translation
Youngwon Lee | Changmin Lee | Hojin Lee | Seung-won Hwang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Despite the success of neural machine translation models, the tension between the fluency gained by optimizing target-side language modeling and faithfulness to the source remains a challenge. Previously, Conditional Bilingual Mutual Information (CBMI), a metric scoring the importance of target sentences and tokens, was proposed to encourage fluent and faithful translations. The score combines probabilities from the translation model and the target language model, and is then used to assign different weights to sentence- and token-level losses. We argue that this metric is not properly normalized, and instead propose Normalized Pointwise Mutual Information (NPMI). NPMI utilizes an additional language model on the source language to approximate the joint likelihood of the source-target pair and the likelihood of the source, which are then used to normalize the score. We show that NPMI better captures the dependence between source and target, and that NPMI-based token-level adaptive training brings improvements over baselines on En-De, De-En, and En-Ro translation tasks.
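As a reading aid, the normalization described above can be written out; this is the standard NPMI form, with the factorization of the joint likelihood following the abstract's description (a sketch of the idea, not the paper's exact notation):

\mathrm{PMI}(x, y) = \log \frac{p_{\mathrm{TM}}(y \mid x)}{p_{\mathrm{LM}}(y)}, \qquad
\mathrm{NPMI}(x, y) = \frac{\mathrm{PMI}(x, y)}{-\log p(x, y)}, \qquad
p(x, y) \approx p_{\mathrm{LM}}(x)\, p_{\mathrm{TM}}(y \mid x)

Dividing by the joint self-information -\log p(x, y) bounds the score in [-1, 1], making weights derived from it comparable across sentences and tokens; the additional source-side language model supplies the p(x) needed to approximate the joint.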