Young-Gil Kim

Also published as: Young Kil Kim, Young-Kil Kim, Young-Kill Kim, Young-gil Kim, YoungKil Kim


2020

POSTECH-ETRI’s Submission to the WMT2020 APE Shared Task: Automatic Post-Editing with Cross-lingual Language Model
Jihyung Lee | WonKee Lee | Jaehun Shin | Baikjin Jung | Young-Kil Kim | Jong-Hyeok Lee
Proceedings of the Fifth Conference on Machine Translation

This paper describes POSTECH-ETRI’s submission to WMT2020 for the shared task on automatic post-editing (APE) for 2 language pairs: English-German (En-De) and English-Chinese (En-Zh). We propose APE systems based on a cross-lingual language model, which jointly adopts translation language modeling (TLM) and masked language modeling (MLM) training objectives in the pre-training stage; the APE models then utilize jointly learned language representations between the source language and the target language. In addition, we created 19 million new synthetic triplets as additional training data for our final ensemble model. According to experimental results on the WMT2020 APE development data set, our models showed an improvement over the baseline by TER of -3.58 and a BLEU score of +5.3 for the En-De subtask; and TER of -5.29 and a BLEU score of +7.32 for the En-Zh subtask.

2019

Data Augmentation by Data Noising for Open-vocabulary Slots in Spoken Language Understanding
Hwa-Yeon Kim | Yoon-Hyung Roh | Young-Kil Kim
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

One of the main challenges in Spoken Language Understanding (SLU) is dealing with ‘open-vocabulary’ slots. Recently, SLU models based on neural networks were proposed, but it is still difficult to recognize the slots of unknown words or ‘open-vocabulary’ slots because of the high cost of creating a manually tagged SLU dataset. This paper proposes data noising, which reflects the characteristics of the ‘open-vocabulary’ slots, for data augmentation. We applied it to an attention-based bi-directional recurrent neural network (Liu and Lane, 2016) and experimented with three datasets: Airline Travel Information System (ATIS), Snips, and MIT-Restaurant. We achieved performance improvements of up to 0.57% and 3.25 in intent prediction (accuracy) and slot filling (f1-score), respectively. Our method is advantageous because it does not require additional memory and it can be applied simultaneously with the training process of the model.

JBNU at MRP 2019: Multi-level Biaffine Attention for Semantic Dependency Parsing
Seung-Hoon Na | Jinwoon Min | Kwanghyeon Park | Jong-Hun Shin | Young-Kil Kim
Proceedings of the Shared Task on Cross-Framework Meaning Representation Parsing at the 2019 Conference on Natural Language Learning

This paper describes Jeonbuk National University (JBNU)’s system for the 2019 shared task on Cross-Framework Meaning Representation Parsing (MRP 2019) at the Conference on Computational Natural Language Learning. Of the five frameworks, we address only the DELPH-IN MRS Bi-Lexical Dependencies (DP), Prague Semantic Dependencies (PSD), and Universal Conceptual Cognitive Annotation (UCCA) frameworks. We propose a unified parsing model using biaffine attention (Dozat and Manning, 2017), consisting of 1) a BERT-BiLSTM encoder and 2) a biaffine attention decoder. First, the BERT-BiLSTM sentence encoder uses BERT to compose a sentence’s wordpieces into word-level embeddings and subsequently applies BiLSTM to word-level representations. Second, the biaffine attention decoder determines the scores for an edge’s existence and its labels based on biaffine attention functions between role-dependent representations. We also present multi-level biaffine attention models by combining all the role-dependent representations that appear at multiple intermediate layers.

2018

Improving a Multi-Source Neural Machine Translation Model with Corpus Extension for Low-Resource Languages
Gyu-Hyeon Choi | Jong-Hun Shin | Young-Kil Kim
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Two-Step Training and Mixed Encoding-Decoding for Implementing a Generative Chatbot with a Small Dialogue Corpus
Jintae Kim | Hyeon-Gu Lee | Harksoo Kim | Yeonsoo Lee | Young-Gil Kim
Proceedings of the Workshop on Intelligent Interactive Systems and Language Generation (2IS&NLG)

2015

Semi-automatic Filtering of Translation Errors in Triangle Corpus
Sung-Kwon Choi | Jong-Hun Shin | Young-Gil Kim
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation: Posters

2013

Patent translation as technical document translation: customizing a Chinese-Korean MT system to patent domain
Yun Jin | Oh-Woog Kwon | Seung-Hoon Na | Young-Gil Kim
Proceedings of the 5th Workshop on Patent Translation

2012

Applying Statistical Post-Editing to English-to-Korean Rule-based Machine Translation System
Ki-Young Lee | Young-Gil Kim
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation

2011

Improving PP Attachment Disambiguation in a Rule-based Parser
Yoon-Hyung Roh | Ki-Young Lee | Young-Gil Kim
Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation

2009

Incorporating Statistical Information of Lexical Dependency into a Rule-Based Parser
Yoon-Hyung Roh | Ki-Young Lee | Young-Gil Kim
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 2

Effective Use of Chinese Structural Auxiliaries for Chinese Parsing
Yun Jin | Qing Li | Yingshun Wu | Young-Gil Kim
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 2

Customizing an English-Korean Machine Translation System for Patent/Technical Documents Translation
Oh-Woog Kwon | Sung-Kwon Choi | Ki-Young Lee | Yoon-Hyung Roh | Young-Gil Kim
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 2

2008

Paraphrasing Depending on Bilingual Context Toward Generalization of Translation Knowledge
Young-Sook Hwang | YoungKil Kim | Sangkyu Park
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

How to Overcome the Domain Barriers in Pattern-Based Machine Translation System
Sung-Kwon Choi | Ki-Young Lee | Yoon-Hyung Roh | Oh-Woog Kwon | Young-Gil Kim
Proceedings of the 22nd Pacific Asia Conference on Language, Information and Computation

What is Needed the Most in MT-Supported Paper Writing
Chang Hyun Kim | Oh-Woog Kwon | Young Kil Kim
Proceedings of the 22nd Pacific Asia Conference on Language, Information and Computation

Recognizing Coordinate Structures for Machine Translation of English Patent Documents
Yoon-Hyung Roh | Ki-Young Lee | Sung-Kwon Choi | Oh-Woog Kwon | Young-Gil Kim
Proceedings of the 22nd Pacific Asia Conference on Language, Information and Computation

2007

Customizing an English-Korean Machine Translation System for Patent Translation
Sung-Kwon Choi | Young-Gil Kim
Proceedings of the 21st Pacific Asia Conference on Language, Information and Computation

Semi-Automatic Annotation Tool to Build Large Dependency Tree-Tagged Corpus
Eun-Jin Park | Jae-Hoon Kim | Chang-Hyun Kim | Young-Kill Kim
Proceedings of the 21st Pacific Asia Conference on Language, Information and Computation

Getting professional translation through user interaction
Young-Ae Seo | Chang-Hyun Kim | Seong-Il Yang | Young-gil Kim
Proceedings of Machine Translation Summit XI: Papers

English-Korean patent system: fromTo-EK/PAT
Oh-Woog Kwon | Sung-Kwon Choi | Ki-Young Lee | Yoon-Hyung Roh | Young-Gil Kim | Munpyo Hong
Proceedings of the Workshop on Patent translation

2005

Customizing a Korean-English MT System for Patent Translation
Munpyo Hong | Young-Gil Kim | Chang-Hyun Kim | Seong-Il Yang | Young-Ae Seo | Cheol Ryu | Sang-Kyu Park
Proceedings of Machine Translation Summit X: Papers

This paper addresses a customization process of a Korean-English MT system for patent translation. The major customization steps include terminology construction, linguistic study, and the modification of the existing analysis and generation modules. To our knowledge, this is the first noteworthy large-scale customization effort of an MT system for Korean and English. This research was performed under the auspices of the MIC (Ministry of Information and Communication) of the Korean government. A prototype patent MT system for the electronics domain was installed and is being tested in the Korean Intellectual Property Office.

Terminology Construction Workflow for Korean-English Patent MT
Young-Gil Kim | Seong-Il Yang | Munpyo Hong | Chang-Hyun Kim | Young-Ae Seo | Cheol Ryu | Sang-Kyu Park | Se-Young Park
Workshop on patent translation

This paper addresses the workflow for terminology construction for a Korean-English patent MT system. The workflow consists of a stage for setting lexical goals and a semi-automatic terminology construction stage. As there is no comparable system, it is difficult to determine how many terms are needed. To estimate the number of terms needed, we analyzed 45,000 patent documents. Given the limited time and budget, we resorted to semi-automatic methods to create the bilingual term dictionary in the electronics domain. We will show that parenthesis information in Korean patent documents and a bilingual title corpus can be successfully used to build a bilingual term dictionary.

2004

Semi-Automatic Construction of Korean-Chinese Verb Patterns Based on Translation Equivalency
Munpyo Hong | Young-Kil Kim | Sang-Kyu Park | Young-Jik Lee
Proceedings of the Workshop on Multilingual Linguistic Resources

2002

Korean-Chinese machine translation based on verb patterns
Changhyun Kim | Munpyo Hong | Yinxia Huang | Young Kil Kim | Sung Il Yang | Young Ae Seo | Sung-Kwon Choi
Proceedings of the 5th Conference of the Association for Machine Translation in the Americas: Technical Papers

This paper describes our ongoing project “Korean-Chinese Machine Translation System”. The main knowledge of our system is verb patterns. Each verb can have several meanings and each meaning of a verb is represented by a verb pattern. A verb pattern consists of a source language pattern part for the analysis and the corresponding target language pattern part for the generation. Each pattern part, according to the degree of generality, contains lexical or semantic information for the arguments or adjuncts of each verb meaning. In this approach, accurate analysis can directly lead to natural and correct generation. Furthermore, as the transfer mainly depends upon verb patterns, the translation rate is expected to rise as the set of verb patterns grows larger.

2001

Verb Pattern Based Korean-Chinese Machine Translation System
Changhyun Kim | Young Kil Kim | Munpyo Hong | Young Ae Seo | Sung Il Yang | Sung-Kwon Choi
Proceedings of the 16th Pacific Asia Conference on Language, Information and Computation