Paraphrasing Depending on Bilingual Context Toward Generalization of Translation Knowledge
Young-Sook Hwang | YoungKil Kim | Sangkyu Park
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I


Customizing a Korean-English MT System for Patent Translation
Munpyo Hong | Young-Gil Kim | Chang-Hyun Kim | Seong-Il Yang | Young-Ae Seo | Cheol Ryu | Sang-Kyu Park
Proceedings of Machine Translation Summit X: Papers

This paper addresses the customization of a Korean-English MT system for patent translation. The major customization steps include terminology construction, linguistic study, and modification of the existing analysis and generation modules. To our knowledge, this is the first noteworthy large-scale customization effort of an MT system for Korean and English. This research was performed under the auspices of the MIC (Ministry of Information and Communication) of the Korean government. A prototype patent MT system for the electronics domain was installed and is being tested in the Korean Intellectual Property Office.

Terminology Construction Workflow for Korean-English Patent MT
Young-Gil Kim | Seong-Il Yang | Munpyo Hong | Chang-Hyun Kim | Young-Ae Seo | Cheol Ryu | Sang-Kyu Park | Se-Young Park
Workshop on patent translation

This paper addresses the workflow for terminology construction for a Korean-English patent MT system. The workflow consists of a stage for setting lexical goals and a semi-automatic terminology construction stage. As there is no comparable system, it is difficult to determine how many terms are needed. To estimate the number of terms needed, we analyzed 45,000 patent documents. Given the limited time and budget, we resorted to semi-automatic methods to create the bilingual term dictionary in the electronics domain. We will show that parenthesis information in Korean patent documents and a bilingual title corpus can be successfully used to build a bilingual term dictionary.
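
As a rough illustration of how parenthesis information might be harvested, the sketch below collects Korean terms that are immediately followed by an English gloss in parentheses. It is not the authors' method: the regular expression, the function names, and the word-count heuristic for deciding where the Korean term begins are all assumptions made for this example.

```python
import re
from collections import Counter

# Assumed pattern: a run of Hangul words directly followed by "(English gloss)".
PAREN_PAIR = re.compile(
    r"([가-힣]+(?: [가-힣]+)*)\s*\(([A-Za-z][A-Za-z \-]*[A-Za-z])\)"
)

def extract_term_pairs(text):
    """Return (korean_term, english_term) candidates found in one document."""
    pairs = []
    for ko_run, en in PAREN_PAIR.findall(text):
        ko_words = ko_run.split()
        en_words = en.split()
        # Crude assumption: the Korean term has as many words as the English
        # gloss, so keep only the last len(en_words) Korean words.
        ko_term = " ".join(ko_words[-len(en_words):])
        pairs.append((ko_term, en.lower()))
    return pairs

def build_dictionary(documents, min_count=1):
    """Count candidates over all documents; keep the most frequent gloss per term."""
    counts = Counter()
    for doc in documents:
        counts.update(extract_term_pairs(doc))
    dictionary = {}
    for (ko, en), n in counts.most_common():
        if n >= min_count and ko not in dictionary:
            dictionary[ko] = en
    return dictionary

docs = ["본 발명은 반도체 소자(semiconductor device)를 포함하며, 트랜지스터(Transistor)가 연결된다."]
term_dict = build_dictionary(docs)
```

The word-count heuristic fails whenever the Korean and English terms differ in length, which is one reason a workflow of this kind would still need manual review of the extracted candidates.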


Semi-Automatic Construction of Korean-Chinese Verb Patterns Based on Translation Equivalency
Munpyo Hong | Young-Kil Kim | Sang-Kyu Park | Young-Jik Lee
Proceedings of the Workshop on Multilingual Linguistic Resources


For the proper treatment of long sentences in a sentence pattern-based English-Korean MT system
Yoon-Hyung Roh | Munpyo Hong | Sung-Kwon Choi | Ki-Young Lee | Sang-Kyu Park
Proceedings of Machine Translation Summit IX: Papers

This paper describes a sentence pattern-based English-Korean machine translation system backed up by a rule-based module as a solution to the translation of long sentences. A rule-based English-Korean MT system typically suffers from low translation accuracy for long sentences due to poor parsing performance. In the proposed method, we use only the phrase-level chunking information from the parse result (i.e., NP, PP, and AP). Applying a sentence pattern directly to the chunking result is expected to yield high analysis performance and good translation quality. The parsing efficiency problem in the traditional RBMT approach is resolved by sentence partitioning, which is generally assumed to have many problems. However, we will show that sentence partitioning has little side effect, if any, in our approach, because we use only the chunking results for the transfer. The coverage problem of a pattern-based method is overcome by applying sentence pattern matching recursively to the sub-sentences of the input sentence when no pattern exactly matches the input sentence.
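
The recursive fallback described in this abstract can be pictured with a minimal sketch. Everything here is hypothetical: the pattern table, the chunk tags (including `VP`, `CONJ`, and `COMMA`), and the choice of split points are illustrative assumptions, not the system's actual patterns or partitioning rules.

```python
# Hypothetical pattern table: source chunk-tag sequence -> target chunk order.
PATTERNS = {
    ("NP", "VP", "NP"): (0, 2, 1),  # English SVO reordered to Korean-style SOV
    ("NP", "VP"): (0, 1),
}

# Assumed partition points between sub-sentences.
SPLIT_TAGS = {"CONJ", "COMMA"}

def apply_patterns(chunks):
    """chunks is a list of (tag, text) pairs.

    Returns the chunk texts in target order, or None when no pattern covers
    the input (the point at which a rule-based backup would take over).
    """
    tags = tuple(tag for tag, _ in chunks)
    if tags in PATTERNS:
        return [chunks[i][1] for i in PATTERNS[tags]]
    # No exact pattern: partition at a split point and recurse on both sides.
    for i, tag in enumerate(tags):
        if tag in SPLIT_TAGS and 0 < i < len(chunks) - 1:
            left = apply_patterns(chunks[:i])
            right = apply_patterns(chunks[i + 1:])
            if left is not None and right is not None:
                return left + [chunks[i][1]] + right
    return None

sentence = [
    ("NP", "I"), ("VP", "read"), ("NP", "the book"),
    ("CONJ", "and"),
    ("NP", "she"), ("VP", "slept"),
]
reordered = apply_patterns(sentence)
```

Because matching works on chunk tags rather than full parse trees, a partition only has to yield sub-sequences that some pattern covers, which is in the spirit of the abstract's claim that partitioning has little side effect.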


CaptionEye/EK: a English-to-Korean caption translation system using the sentence pattern
Young-Ae Seo | Yoon-Hyung Roh | Ki-Young Lee | Sang-Kyu Park
Proceedings of Machine Translation Summit VIII


English-to-Korean Web translator : “FromTo/Web-EK”
Sung-Kwon Choi | Taewan Kim | Sanghwa Yuh | Han-Min Jung | Chul-Min Sim | Sang-Kyu Park
Proceedings of Machine Translation Summit VII

The previous English-Korean MT systems that have been developed in Korea have dealt only with written text as the object of translation. Most of them enumerated the following list of problems that did not seem easy to solve in the near future: 1) processing of non-continuous idiomatic expressions, 2) reduction of excessive POS or structural ambiguities, 3) robust processing of long sentences and parsing failures, and 4) selecting the correct word correspondence among several alternatives. These problems are important factors influencing the translation quality of a machine translation system. This paper describes not only solutions to these problems of the previous English-to-Korean machine translation systems but also the management of HTML tags between two structurally different languages, English and Korean. With these solutions, we successfully translate English web documents into Korean in the English-to-Korean web translator "FromTo/Web-EK", which has been under development since 1997.

From To K/E: a Korean-English machine translation system based on idiom recognition and fail softening
Byong-Rae Ryu | Youngkil Kim | Sanghwa Yuh | Sangkyu Park
Proceedings of Machine Translation Summit VII

In this paper we describe and experimentally evaluate FromTo K/E, a rule-based Korean-English machine translation system adopting the transfer methodology. In accordance with the view that a successful Korean-English machine translation system presumes a highly efficient, robust Korean parser, we develop a parser reinforced with "Fail Softening", i.e., long sentence segmentation and the recovery of failed parse trees. To overcome the language-typological differences between Korean and English, we adopt a powerful module for processing Korean multi-word lexemes and Korean idiomatic expressions. Furthermore, prior to parsing Korean sentences, we try to resolve the ambiguity of words with unknown grammatical functions on the basis of collocation and subcategorization information. The results of the experimental evaluation show that the degree of understandability for a sample of 2,000 sentences amounts to 2.67, indicating that the meaning of the translated English sentences is almost clear to users, but the sentences still include minor grammatical or stylistic errors affecting at most 30% of the words.