Terumasa Ehara


2018

SMT reranked NMT (2)
Terumasa Ehara
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 5th Workshop on Asian Translation

2017

SMT reranked NMT
Terumasa Ehara
Proceedings of the 4th Workshop on Asian Translation (WAT2017)

The system architecture, experimental settings and experimental results of the EHR team for the WAT2017 tasks are described. We participate in three tasks: JPCen-ja, JPCzh-ja and JPCko-ja. Although the basic architecture of our system is NMT, the n-best NMT outputs are reranked using SMT results. One of the major drawbacks of NMT is under-translation and over-translation, whereas SMT rarely produces such translations. Reranking the n-best NMT outputs against the SMT output can therefore be expected to discard such translations. This technique improves the BLEU score from 46.03 to 47.08 on the JPCzh-ja task.
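
The reranking idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how NMT candidates are scored against the SMT output, so the token-overlap F1 used here is an assumption chosen for illustration, and the names `token_f1` and `rerank_by_smt` are hypothetical, not taken from the paper.

```python
from collections import Counter
from typing import List


def token_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1 between two whitespace-tokenized strings.

    Assumption: a stand-in similarity measure, not the paper's own score.
    """
    cand, ref = candidate.split(), reference.split()
    if not cand or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)  # penalizes over-translation (extra tokens)
    recall = overlap / len(ref)      # penalizes under-translation (missing tokens)
    return 2 * precision * recall / (precision + recall)


def rerank_by_smt(nbest: List[str], smt_output: str) -> List[str]:
    """Sort n-best NMT candidates by similarity to the SMT output, best first."""
    return sorted(nbest, key=lambda c: token_f1(c, smt_output), reverse=True)


if __name__ == "__main__":
    nbest = [
        "the cat sat",                        # under-translation
        "the cat sat sat sat on the mat mat", # over-translation
        "the cat sat on the mat",
    ]
    print(rerank_by_smt(nbest, "a cat sat on the mat")[0])
```

Because precision drops when a candidate repeats tokens and recall drops when it omits them, candidates that over- or under-translate relative to the SMT output tend to fall in the ranking under this assumed measure.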

2016

Translation systems and experimental results of the EHR group for WAT2016 tasks
Terumasa Ehara
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)

The system architecture, experimental settings and experimental results of the group for the WAT2016 tasks are described. We participate in six tasks: en-ja, zh-ja, JPCzh-ja, JPCko-ja, HINDENen-hi and HINDENhi-ja. Although the basic architecture of our systems is PBSMT with reordering, several additional techniques are applied. In particular, the system for the HINDENhi-ja task, which pivots through English, uses the reordering technique. Because Hindi and Japanese are both OV-type languages and English is a VO-type language, we can apply the reordering technique to the pivot language. The reordering technique improves the BLEU score from 7.47 to 7.66 for sentence-level pivoting on this task.

Translation Using JAPIO Patent Corpora: JAPIO at WAT2016
Satoshi Kinoshita | Tadaaki Oshio | Tomoharu Mitsuhashi | Terumasa Ehara
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)

We participate in the scientific paper subtask (ASPEC-EJ/CJ) and the patent subtask (JPC-EJ/CJ/KJ) with phrase-based SMT systems trained on our own patent corpora. Using larger corpora than those prepared by the workshop organizer, we achieved higher BLEU scores than most participants in EJ and CJ translation in the patent subtask. In the crowdsourcing evaluation, however, our EJ translation, which was best in all automatic evaluations, received a very poor score. In the scientific paper subtask, our translations received lower scores than most translations produced by engines trained on the in-domain corpora, but higher scores than general-purpose RBMT systems and online services. The crowdsourcing evaluation results suggest that a CJ SMT system trained on a large patent corpus may translate non-patent technical documents at a practical level.

2015

System Combination of RBMT plus SPE and Preordering plus SMT
Terumasa Ehara
Proceedings of the 2nd Workshop on Asian Translation (WAT2015)

2014

A machine translation system combining rule-based machine translation and statistical post-editing
Terumasa Ehara
Proceedings of the 1st Workshop on Asian Translation (WAT2014)

2009

Meta-evaluation of Automatic Evaluation Methods for Machine Translation using Patent Translation Data in NTCIR-7
Hiroshi Echizen-ya | Terumasa Ehara | Sayori Shimohata | Atsushi Fujii | Masao Utiyama | Mikio Yamamoto | Takehito Utsuro | Noriko Kando
Proceedings of the Third Workshop on Patent Translation

2004

Back Transliteration from Japanese to English using Target English Context
Isao Goto | Naoto Kato | Terumasa Ehara | Hideki Tanaka
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2003

Transliteration considering context information based on the maximum entropy method
Isao Goto | Naoto Kato | Noriyoshi Uratani | Terumasa Ehara
Proceedings of Machine Translation Summit IX: Papers

This paper proposes a method for automatically transliterating English words into Japanese. Our method successfully transliterates an English word not registered in any bilingual or pronunciation dictionary by converting each group of letters in the English word into Japanese katakana characters. In such transliteration, identical letters occurring in different English words must often be converted into different katakana. To produce an adequate transliteration, the proposed method considers how the letters of an English word are chunked into conversion units and uses English and Japanese context information simultaneously to calculate the plausibility of each conversion. We have confirmed experimentally that the proposed method improves the conversion accuracy by 63% compared to a simple method that ignores the plausibility of chunking and contextual information.

A multi-language translation example browser
Isao Goto | Naoto Kato | Noriyoshi Uratani | Terumasa Ehara | Tadashi Kumano | Hideki Tanaka
Proceedings of Machine Translation Summit IX: System Presentations

This paper describes a Multi-language Translation Example Browser, a type of translation memory system. The system is able to retrieve translation examples from bilingual news databases, which consist of news transcripts of past broadcasts. We put a Japanese-English system to practical use and undertook trial operations of a system of eight language-pairs.

2001

An automatic evaluation method for machine translation using two-way MT
Shoichi Yokoyama | Hideki Kashioka | Akira Kumano | Masaki Matsudaira | Yoshiko Shirokizawa | Shuji Kodama | Terumasa Ehara | Shinichiro Miyazawa | Yuzo Murata
Proceedings of Machine Translation Summit VIII

Evaluation of machine translation is one of the most important issues in this field. We have previously proposed a quantitative evaluation method for machine translation systems. In outline, an example sentence in Japanese is machine translated into English and then back into Japanese using several systems, and the output Japanese sentences are compared with the original Japanese sentence in terms of word identification, correctness of the modification, syntactic dependency, and parataxis. By calculating a score from this comparison, we could quantitatively evaluate the English machine translation. However, the extraction of word identification and the other features was done by humans, which affects the correctness of the evaluation. To solve this problem, we developed an automatic evaluation system. We report the details of the system in this paper.

1999

Quantitative evaluation of machine translation using two-way MT
Shoichi Yokoyama | Akira Kumano | Masaki Matsudaira | Yoshiko Shirokizawa | Mutsumi Kawagoe | Shuji Kodama | Hideki Kashioka | Terumasa Ehara | Shinichiro Miyazawa | Yasuo Nakajima
Proceedings of Machine Translation Summit VII

One of the most important issues in the field of machine translation is evaluation of the translated sentences. This paper proposes a quantitative method of evaluation for machine translation systems. The method is as follows. First, an example sentence in Japanese is machine translated into English using several Japanese-English machine translation systems. Second, the output English sentences are machine translated into Japanese using several English-Japanese machine translation systems (different from the Japanese-English machine translation systems). Then, each output Japanese sentence is compared with the original Japanese sentence in terms of word identification, correctness of the modification, syntactic dependency, and parataxis. An average score is calculated, and this becomes the total evaluation of the machine translation of the sentence. From this two-way machine translation and the calculation of the score, we can quantitatively evaluate the English machine translation. For the present study, we selected 100 Japanese sentences from the abstracts of scientific articles. Each of these sentences has an English translation produced by a human. Approximately half of these sentences are evaluated and the results are given. In addition, human and machine translations are compared, and the trade-off between the two methods of translation is discussed.
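
The two-way (round-trip) procedure described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the paper compares sentences on word identification, modification, syntactic dependency and parataxis, whereas this sketch swaps in a plain character-level similarity from Python's `difflib`. The translator callables and the names `char_similarity` and `round_trip_score` are hypothetical placeholders.

```python
from difflib import SequenceMatcher
from typing import Callable, Sequence

# A translator is modeled as any function from a sentence to a sentence.
Translator = Callable[[str], str]


def char_similarity(original: str, back_translated: str) -> float:
    """Character-level similarity in [0, 1].

    Assumption: a simple stand-in for the paper's linguistic comparison.
    """
    return SequenceMatcher(None, original, back_translated).ratio()


def round_trip_score(
    sentence: str,
    forward_systems: Sequence[Translator],
    backward_systems: Sequence[Translator],
    similarity: Callable[[str, str], float] = char_similarity,
) -> float:
    """Average similarity between a sentence and all of its round trips.

    Every forward system is paired with every backward system, and the
    resulting scores are averaged into one value for the sentence.
    """
    if not forward_systems or not backward_systems:
        raise ValueError("at least one forward and one backward system is required")
    scores = [
        similarity(sentence, backward(forward(sentence)))
        for forward in forward_systems
        for backward in backward_systems
    ]
    return sum(scores) / len(scores)
```

With real systems, `forward_systems` would hold Japanese-English translators and `backward_systems` a different set of English-Japanese translators, as the abstract specifies. Any callable from string to string works for experimentation.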

1998

Project for production of closed-caption TV programs for the hearing impaired
Takahiro Wakao | Eiji Sawamura | Terumasa Ehara | Ichiro Maruyama
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

1997

Application of NLP technology to production of closed-caption TV programs in Japanese for the hearing impaired
Takahiro Wakao | Terumasa Ehara | Eiji Sawamura | Yoshiharu Abe | Katsuhiko Shirai
Natural Language Processing for Communication Aids

1991

Processing Unknown Words in Continuous Speech Recognition
Kenji Kita | Terumasa Ehara | Tsuyoshi Morimoto
Proceedings of the Second International Workshop on Parsing Technologies

Current continuous speech recognition systems essentially ignore unknown words. Systems are designed to recognize words in the lexicon. However, for using speech recognition systems in real applications of spoken-language processing, it is very important to process unknown words. This paper proposes a continuous speech recognition method which accepts any utterance that might include unknown words. In this method, words not in the lexicon are transcribed as phone sequences, while words in the lexicon are recognized correctly. The HMM-LR speech recognition system, which is an integration of Hidden Markov Models and generalized LR parsing, is used as the baseline system, and enhanced with the trigram model of syllables to take into account the stochastic characteristics of a language. Preliminary results indicate that our approach is very promising.

1990

A Machine Translation System for Foreign News in Satellite Broadcasting
Teruaki Aizawa | Terumasa Ehara | Noriyoshi Uratani | Hideki Tanaka | Naoto Kato | Sumio Nakase | Norikazu Aruga | Takeo Matsuda
COLING 1990 Volume 3: Papers presented to the 13th International Conference on Computational Linguistics