2012
Two stage Machine Translation System using Pattern-based MT and Phrase-based SMT
Jin’ichi Murakami | Takuya Nishimura | Masato Tokuhisa
Workshop on Monolingual Machine Translation
We have developed a two-stage machine translation (MT) system. The first stage consists of an automatically created pattern-based machine translation (PBMT) system, and the second stage consists of a standard phrase-based statistical machine translation (SMT) system. We evaluated the system on the Japanese-English simple sentence task. First, we obtained English sentences from Japanese sentences using the automatically created Japanese-English pattern-based machine translation system. We refer to the English sentences obtained in this way as “English”. Second, we applied a standard SMT system (Moses) to the results. That is, we translated the “English” sentences into English by SMT. We also conducted ABX tests (Clark, 1982) on 100 sentences to compare the output of the standard SMT system (Moses) with that of the proposed system. In these tests, 30 sentences output by the proposed system were judged better than the corresponding output of the standard SMT system, whereas 9 sentences output by the standard SMT system were judged better than the corresponding output of the proposed system. This indicates that our proposed system was effective in the Japanese-English simple sentence task.
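The ABX counts in the abstract above (30 preferences for the proposed system, 9 for the standard SMT system, out of 100 sentences) can be checked with an exact two-sided sign test. The abstract does not say that the authors ran such a test, so this is only an illustrative sketch. It assumes the remaining 61 sentences were judged equal and can be dropped as ties; the function name `sign_test_two_sided` is hypothetical.

```python
from math import comb


def sign_test_two_sided(wins_a: int, wins_b: int) -> float:
    """Exact two-sided sign test p-value; ties must be excluded beforehand."""
    n = wins_a + wins_b
    k = max(wins_a, wins_b)
    # Probability of seeing k or more wins out of n if both systems
    # were equally likely to be preferred (fair-coin null hypothesis).
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double the one-sided tail and clip at 1.0.
    return min(1.0, 2 * upper_tail)


# Counts taken from the abstract; the 61 undecided sentences are
# assumed to be ties and are left out (an assumption, not a stated fact).
proposed_better = 30
baseline_better = 9
p_value = sign_test_two_sided(proposed_better, baseline_better)
print(f"decided pairs: {proposed_better + baseline_better}, p = {p_value:.5f}")
```

A small p-value here would only say that the 30-to-9 split is unlikely under equal preference; it does not account for how the 100 sentences were sampled or judged.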
2010
Statistical pattern-based MT with statistical French-English MT
Jin’ichi Murakami | Takuya Nishimura | Masato Tokuhisa
Proceedings of the 7th International Workshop on Spoken Language Translation: Evaluation Campaign
2009
Statistical machine translation adding pattern-based machine translation in Chinese-English translation
Jin’ichi Murakami | Masato Tokuhisa | Satoru Ikehara
Proceedings of the 6th International Workshop on Spoken Language Translation: Evaluation Campaign
We have developed a two-stage machine translation (MT) system. The first stage is a rule-based machine translation system. The second stage is a standard statistical machine translation system. For Chinese-English machine translation, we first used a Chinese-English rule-based MT system to obtain “ENGLISH” sentences from Chinese sentences. Second, we applied standard statistical machine translation. That is, we translated the “ENGLISH” sentences into English. We believe this method has two advantages. One is that there are fewer unknown words. The other is that it produces structured or grammatically correct sentences. In our experiments, the proposed method obtained a BLEU score of 0.3151 on the BTEC-CE task, whereas a standard method (Moses) obtained a BLEU score of 0.3311. This means that our proposed method was less effective than the standard method on the BTEC-CE task. We will therefore try to improve its performance by optimizing parameters.
2008
Non-Compositional Language Model and Pattern Dictionary Development for Japanese Compound and Complex Sentences
Satoru Ikehara | Masato Tokuhisa | Jin’ichi Murakami
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)
Statistical machine translation without long parallel sentences for training data.
Jin’ichi Murakami | Masato Tokuhisa | Satoru Ikehara
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign
In this study, we focused on the reliability of the phrase table. We have been building phrase tables using Och’s method [2], which sometimes generates completely wrong phrase tables. We found that such phrase tables are caused by long parallel sentences. Therefore, we removed these long parallel sentences from the training data. We also used general tools for statistical machine translation, such as “Giza++” [3], “moses” [4], and “training-phrase-model.perl” [5]. Our proposed method obtained BLEU scores of 0.4047 (TEXT) and 0.3553 (1-BEST) on the Challenge-EC task, whereas a standard method obtained BLEU scores of 0.3975 (TEXT) and 0.3482 (1-BEST). This means that our proposed method was effective for the Challenge-EC task. However, it was not effective for the BTEC-CE and Challenge-CE tasks, and our system did not perform well overall. For example, it placed 7th among 8 systems in the Challenge-EC task.
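The filtering step described in the abstract above, removing long parallel sentences from the training data, might look roughly like the following. The abstract gives no length threshold or tokenization, so `MAX_TOKENS`, the whitespace tokenization (which presumes pre-segmented text), and the function name `drop_long_pairs` are all assumptions made for illustration, not the authors' settings.

```python
from typing import Iterable, List, Tuple

# Hypothetical cutoff; the abstract does not state the value used.
MAX_TOKENS = 40


def drop_long_pairs(
    pairs: Iterable[Tuple[str, str]], max_tokens: int = MAX_TOKENS
) -> List[Tuple[str, str]]:
    """Keep only sentence pairs where both sides have at most max_tokens tokens."""
    kept = []
    for source, target in pairs:
        # Whitespace tokenization: assumes the text is already word-segmented.
        if len(source.split()) <= max_tokens and len(target.split()) <= max_tokens:
            kept.append((source, target))
    return kept


# Toy corpus: one short pair and one pair with 50 tokens on each side.
corpus = [("a b c", "x y z"), ("a " * 50, "x " * 50)]
filtered = drop_long_pairs(corpus)
print(len(corpus), "->", len(filtered), "pairs")
```

Filtering on both sides keeps the pair aligned: a pair is dropped as a whole, so the source and target files stay line-for-line parallel before word alignment and phrase extraction.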
2007
Statistical machine translation using large J/E parallel corpus and long phrase tables
Jin’ichi Murakami | Masato Tokuhisa | Satoru Ikehara
Proceedings of the Fourth International Workshop on Spoken Language Translation
We describe our statistical machine translation system, which uses a large set of Japanese-English parallel sentences and long phrase tables. We collected 698,973 Japanese-English parallel sentences and used long phrase tables. We also used general tools for statistical machine translation, such as “Giza++” [1], “moses” [2], and “training-phrase-model.perl” [3]. Using these data and tools, we took part in the IWSLT07 evaluation campaign and obtained a BLEU score of 0.4321.
1999
Automatic generation of semantic dependency rules for Japanese noun phrases with particles “no”
Satoru Ikehara | Shinnji Nakai | Jin’ichi Murakami
Proceedings of the 8th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages