Sunghyun Park


2021

Dialogue Response Generation via Contrastive Latent Representation Learning
Shuyang Dai | Guoyin Wang | Sunghyun Park | Sungjin Lee
Proceedings of the 3rd Workshop on Natural Language Processing for Conversational AI

Large-scale auto-regressive models have achieved great success in dialogue response generation with the help of Transformer layers. However, these models do not learn a representative latent space of the sentence distribution, which makes the generation hard to control. Recent work has tried to learn sentence representations with Transformer-based frameworks, but it does not model the context-response relationship embedded in dialogue datasets. In this work, we aim to construct a robust sentence representation learning model with a Transformer-based encoder-decoder structure that is specifically designed for dialogue response generation. We propose utterance-level contrastive learning, which encodes predictive information in each context representation for its corresponding response. Extensive experiments are conducted to verify the robustness of the proposed representation learning mechanism. Using both reference-based and reference-free evaluation metrics, we provide a detailed analysis of the generated sentences, demonstrating the effectiveness of our proposed model.

What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers
Boseop Kim | HyoungSeok Kim | Sang-Woo Lee | Gichang Lee | Donghyun Kwak | Jeon Dong Hyeon | Sunghyun Park | Sungju Kim | Seonhoon Kim | Dongpil Seo | Heungsub Lee | Minyoung Jeong | Sungjae Lee | Minsub Kim | Suk Hyun Ko | Seokhun Kim | Taeyong Park | Jinuk Kim | Soyoung Kang | Na-Hyeon Ryu | Kang Min Yoo | Minsuk Chang | Soobin Suh | Sookyo In | Jinseong Park | Kyungduk Kim | Hiun Kim | Jisu Jeong | Yong Goo Yeo | Donghoon Ham | Dongju Park | Min Young Lee | Jaewook Kang | Inho Kang | Jung-Woo Ha | Woomyoung Park | Nako Sung
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

GPT-3 demonstrates the remarkable in-context learning ability of large-scale language models (LMs) trained on data at the hundreds-of-billions scale. Here we address some remaining issues less reported by the GPT-3 paper, such as a non-English LM, the performance of differently sized models, and the effect of recently introduced prompt optimization on in-context learning. To achieve this, we introduce HyperCLOVA, a Korean variant of 82B GPT-3 trained on a Korean-centric corpus of 560B tokens. Enhanced by our Korean-specific tokenization, HyperCLOVA with our training configuration shows state-of-the-art in-context zero-shot and few-shot learning performance on various downstream tasks in Korean. We also show the performance benefits of prompt-based learning and demonstrate how it can be integrated into the prompt engineering pipeline. We then discuss the possibility of materializing the No Code AI paradigm by providing AI prototyping capabilities to non-experts in ML through HyperCLOVA studio, an interactive prompt engineering interface. Lastly, we demonstrate the potential of our methods with three successful in-house applications.

A Scalable Framework for Learning From Implicit User Feedback to Improve Natural Language Understanding in Large-Scale Conversational AI Systems
Sunghyun Park | Han Li | Ameen Patel | Sidharth Mudgal | Sungjin Lee | Young-Bum Kim | Spyros Matsoukas | Ruhi Sarikaya
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Natural Language Understanding (NLU) is an established component within a conversational AI or digital assistant system, and it is responsible for producing semantic understanding of a user request. We propose a scalable and automatic approach for improving NLU in a large-scale conversational AI system by leveraging implicit user feedback, based on the insight that user interaction data and dialog context carry rich information from which user satisfaction and intention can be inferred. In particular, we propose a domain-agnostic framework for curating new supervision data for improving NLU from live production traffic. With an extensive set of experiments, we show the results of applying the framework and improving NLU for a large-scale production system across 10 domains.

Learning Slice-Aware Representations with Mixture of Attentions
Cheng Wang | Sungjin Lee | Sunghyun Park | Han Li | Young-Bum Kim | Ruhi Sarikaya
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2019

Learning with Limited Data for Multilingual Reading Comprehension
Kyungjae Lee | Sunghyun Park | Hojae Han | Jinyoung Yeo | Seung-won Hwang | Juho Lee
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

This paper studies the problem of supporting question answering in a new language with limited training resources. As an extreme scenario, when no such resource exists, one can (1) transfer labels from another language, and (2) generate labels from unlabeled data, using a translator and an automatic labeling function, respectively. However, these approaches inevitably introduce noise into the training data, due to translation or generation errors, which requires judicious use of data with varying confidence. To address this challenge, we propose a weakly-supervised framework that quantifies such noise in automatically generated labels, in order to deemphasize or fix noisy data in training. On a reading comprehension task, we demonstrate the effectiveness of our model on low-resource languages with varying similarity to English, namely Korean and French.

2018

Semi-supervised Training Data Generation for Multilingual Question Answering
Kyungjae Lee | Kyoungho Yoon | Sunghyun Park | Seung-won Hwang
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2014

Verbal Behaviors and Persuasiveness in Online Multimedia Content
Moitreya Chatterjee | Sunghyun Park | Han Suk Shim | Kenji Sagae | Louis-Philippe Morency
Proceedings of the Second Workshop on Natural Language Processing for Social Media (SocialNLP)