Min Young Lee
2026
Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning
Sanghwan Bae | Jiwoo Hong | Min Young Lee | Hanbyul Kim | Jeongyeon Nam | Donghyun Kwak
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Recent advances in reinforcement learning with verifiable rewards (RLVR) show that large language models enhance their reasoning abilities when trained with verifiable signals. However, due to reward sparsity, effectiveness depends heavily on selecting samples of appropriate difficulty. In this work, we present a formal analysis of online difficulty-aware filtering and establish its theoretical foundations. We show that expected policy improvement is lower-bounded by the variance of task-level success probabilities, implying that selecting tasks of intermediate difficulty maximizes learning efficiency. Building on this, we demonstrate that balanced filtering maximizes this lower bound, leading to superior performance and sample efficiency. Evaluations across multiple math reasoning benchmarks validate that balanced filtering consistently enhances convergence speed and final performance, achieving up to +12% gains in less than half the training steps of standard GRPO. By extending our analysis to various reward distributions, we provide a principled foundation for future RLVR curriculum strategies, confirmed through both theoretical analysis and extensive empirical results.
2022
Keep Me Updated! Memory Management in Long-term Conversations
Sanghwan Bae | Donghyun Kwak | Soyoung Kang | Min Young Lee | Sungdong Kim | Yuin Jeong | Hyeri Kim | Sang-Woo Lee | Woomyoung Park | Nako Sung
Findings of the Association for Computational Linguistics: EMNLP 2022
Remembering important information from the past and continuing to talk about it in the present are crucial in long-term conversations. However, previous literature does not deal with cases where the memorized information is outdated, which may cause confusion in later conversations. To address this issue, we present a novel task and a corresponding dataset of memory management in long-term conversations, in which bots keep track of and bring up the latest information about users while conversing through multiple sessions. In order to support more precise and interpretable memory, we represent memory as unstructured text descriptions of key information and propose a new mechanism of memory management that selectively eliminates invalidated or redundant information. Experimental results show that our approach outperforms the baselines that leave the stored memory unchanged in terms of engagingness and humanness, with a larger performance gap in the later sessions.
2021
What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers
Boseop Kim | HyoungSeok Kim | Sang-Woo Lee | Gichang Lee | Donghyun Kwak | Jeon Dong Hyeon | Sunghyun Park | Sungju Kim | Seonhoon Kim | Dongpil Seo | Heungsub Lee | Minyoung Jeong | Sungjae Lee | Minsub Kim | Suk Hyun Ko | Seokhun Kim | Taeyong Park | Jinuk Kim | Soyoung Kang | Na-Hyeon Ryu | Kang Min Yoo | Minsuk Chang | Soobin Suh | Sookyo In | Jinseong Park | Kyungduk Kim | Hiun Kim | Jisu Jeong | Yong Goo Yeo | Donghoon Ham | Dongju Park | Min Young Lee | Jaewook Kang | Inho Kang | Jung-Woo Ha | Woomyoung Park | Nako Sung
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
GPT-3 shows the remarkable in-context learning ability of large-scale language models (LMs) trained on data at the hundreds-of-billions scale. Here we address some remaining issues less reported by the GPT-3 paper, such as a non-English LM, the performance of different-sized models, and the effect of recently introduced prompt optimization on in-context learning. To achieve this, we introduce HyperCLOVA, a Korean variant of 82B GPT-3 trained on a Korean-centric corpus of 560B tokens. Enhanced by our Korean-specific tokenization, HyperCLOVA with our training configuration shows state-of-the-art in-context zero-shot and few-shot learning performance on various downstream tasks in Korean. Also, we show the performance benefits of prompt-based learning and demonstrate how it can be integrated into the prompt engineering pipeline. Then we discuss the possibility of materializing the No Code AI paradigm by providing AI prototyping capabilities to non-experts in ML by introducing HyperCLOVA studio, an interactive prompt engineering interface. Lastly, we demonstrate the potential of our methods with three successful in-house applications.
Co-authors
- Donghyun Kwak 3
- Sanghwan Bae 2
- Soyoung Kang 2
- Sang-Woo Lee 2
- Woomyoung Park 2
- Nako Sung 2
- Minsuk Chang 1
- Jeon Dong Hyeon 1
- Jung-Woo Ha 1
- Donghoon Ham 1
- Jiwoo Hong 1
- Sookyo In 1
- Minyoung Jeong 1
- Jisu Jeong 1
- Yuin Jeong 1
- Jaewook Kang 1
- Inho Kang 1
- Boseop Kim 1
- HyoungSeok Kim 1
- Sungju Kim 1
- Seonhoon Kim 1
- Minsub Kim 1
- Seokhun Kim 1
- Jinuk Kim 1
- Kyungduk Kim 1
- Hiun Kim 1
- Sungdong Kim 1
- Hyeri Kim 1
- Hanbyul Kim 1
- Suk Hyun Ko 1
- Gichang Lee 1
- Heungsub Lee 1
- Sungjae Lee 1
- Jeongyeon Nam 1
- Sunghyun Park 1
- Taeyong Park 1
- Jinseong Park 1
- Dongju Park 1
- Na-Hyeon Ryu 1
- Dongpil Seo 1
- Soobin Suh 1
- Yong Goo Yeo 1
- Kang Min Yoo 1