2024
pdf
bib
abs
Findings of the WMT 2024 Shared Task on Non-Repetitive Translation
Kazutaka Kinugawa
|
Hideya Mino
|
Isao Goto
|
Naoto Shirai
Proceedings of the Ninth Conference on Machine Translation
The repetition of words in an English sentence can create a monotonous or awkward impression. In such cases, repetition should be avoided appropriately. To evaluate the performance of machine translation (MT) systems in avoiding such repetition and outputting more polished translations, we presented the shared task of controlling the lexical choice of MT systems. From Japanese–English parallel news articles, we collected several hundred sentence pairs in which the source sentences containing repeated words were translated in a style that avoided repetition. Participants were required to encourage the MT system to output tokens in a non-repetitive manner while maintaining translation quality. We conducted human and automatic evaluations of systems submitted by two teams based on an encoder-decoder Transformer and a large language model, respectively. From the experimental results and analysis, we report a series of findings on this task.
pdf
bib
Proceedings of the Eleventh Workshop on Asian Translation (WAT 2024)
Toshiaki Nakazawa
|
Isao Goto
Proceedings of the Eleventh Workshop on Asian Translation (WAT 2024)
2023
pdf
bib
Proceedings of the 10th Workshop on Asian Translation
Toshiaki Nakazawa
|
Kazutaka Kinugawa
|
Hideya Mino
|
Isao Goto
|
Raj Dabre
|
Shohei Higashiyama
|
Shantipriya Parida
|
Makoto Morishita
|
Ondřej Bojar
|
Akiko Eriguchi
|
Yusuke Oda
|
Chenhui Chu
|
Sadao Kurohashi
Proceedings of the 10th Workshop on Asian Translation
pdf
bib
abs
Overview of the 10th Workshop on Asian Translation
Toshiaki Nakazawa
|
Kazutaka Kinugawa
|
Hideya Mino
|
Isao Goto
|
Raj Dabre
|
Shohei Higashiyama
|
Shantipriya Parida
|
Makoto Morishita
|
Ondřej Bojar
|
Akiko Eriguchi
|
Yusuke Oda
|
Chenhui Chu
|
Sadao Kurohashi
Proceedings of the 10th Workshop on Asian Translation
This paper presents the results of the shared tasks from the 10th workshop on Asian translation (WAT2023). For the WAT2023, 2 teams submitted their translation results for the human evaluation. We also accepted 1 research paper. About 40 translation results were submitted to the automatic evaluation server, and selected submissions were manually evaluated.
2022
pdf
bib
abs
Overview of the 9th Workshop on Asian Translation
Toshiaki Nakazawa
|
Hideya Mino
|
Isao Goto
|
Raj Dabre
|
Shohei Higashiyama
|
Shantipriya Parida
|
Anoop Kunchukuttan
|
Makoto Morishita
|
Ondřej Bojar
|
Chenhui Chu
|
Akiko Eriguchi
|
Kaori Abe
|
Yusuke Oda
|
Sadao Kurohashi
Proceedings of the 9th Workshop on Asian Translation
This paper presents the results of the shared tasks from the 9th workshop on Asian translation (WAT2022). For the WAT2022, 8 teams submitted their translation results for the human evaluation. We also accepted 4 research papers. About 300 translation results were submitted to the automatic evaluation server, and selected submissions were manually evaluated.
2021
pdf
bib
Proceedings of the 8th Workshop on Asian Translation (WAT2021)
Toshiaki Nakazawa
|
Hideki Nakayama
|
Isao Goto
|
Hideya Mino
|
Chenchen Ding
|
Raj Dabre
|
Anoop Kunchukuttan
|
Shohei Higashiyama
|
Hiroshi Manabe
|
Win Pa Pa
|
Shantipriya Parida
|
Ondřej Bojar
|
Chenhui Chu
|
Akiko Eriguchi
|
Kaori Abe
|
Yusuke Oda
|
Katsuhito Sudoh
|
Sadao Kurohashi
|
Pushpak Bhattacharyya
Proceedings of the 8th Workshop on Asian Translation (WAT2021)
pdf
bib
abs
Overview of the 8th Workshop on Asian Translation
Toshiaki Nakazawa
|
Hideki Nakayama
|
Chenchen Ding
|
Raj Dabre
|
Shohei Higashiyama
|
Hideya Mino
|
Isao Goto
|
Win Pa Pa
|
Anoop Kunchukuttan
|
Shantipriya Parida
|
Ondřej Bojar
|
Chenhui Chu
|
Akiko Eriguchi
|
Kaori Abe
|
Yusuke Oda
|
Sadao Kurohashi
Proceedings of the 8th Workshop on Asian Translation (WAT2021)
This paper presents the results of the shared tasks from the 8th workshop on Asian translation (WAT2021). For the WAT2021, 28 teams participated in the shared tasks and 24 teams submitted their translation results for the human evaluation. We also accepted 5 research papers. About 2,100 translation results were submitted to the automatic evaluation server, and selected submissions were manually evaluated.
pdf
bib
abs
NHK’s Lexically-Constrained Neural Machine Translation at WAT 2021
Hideya Mino
|
Kazutaka Kinugawa
|
Hitoshi Ito
|
Isao Goto
|
Ichiro Yamada
|
Takenobu Tokunaga
Proceedings of the 8th Workshop on Asian Translation (WAT2021)
This paper describes the system of our team (NHK) for the WAT 2021 Japanese-English restricted machine translation task. In this task, the aim is to improve quality while maintaining consistent terminology for scientific paper translation. This task has a unique feature, where some words in a target sentence are given in addition to a source sentence. In this paper, we use lexically-constrained neural machine translation (NMT), which concatenates the source sentence and the constrained words with a special token and inputs them into the NMT encoder. The key to successful lexically-constrained NMT is how constraints are extracted from the target sentences of the training data. We propose two extraction methods: proper-noun constraint and mistranslated-word constraint. These two methods consider the importance of words and the fallibility of NMT, respectively. The evaluation results demonstrate the effectiveness of our lexical-constraint method.
2020
pdf
bib
abs
Effective Use of Target-side Context for Neural Machine Translation
Hideya Mino
|
Hitoshi Ito
|
Isao Goto
|
Ichiro Yamada
|
Takenobu Tokunaga
Proceedings of the 28th International Conference on Computational Linguistics
pdf
bib
abs
Content-Equivalent Translated Parallel News Corpus and Extension of Domain Adaptation for NMT
Hideya Mino
|
Hideki Tanaka
|
Hitoshi Ito
|
Isao Goto
|
Ichiro Yamada
|
Takenobu Tokunaga
Proceedings of the Twelfth Language Resources and Evaluation Conference
In this paper, we deal with two problems in Japanese-English machine translation of news articles. The first problem is the quality of parallel corpora. Neural machine translation (NMT) systems suffer degraded performance when trained with noisy data. Because there is no clean Japanese-English parallel data for news articles, we build a novel parallel news corpus consisting of Japanese news articles translated into English in a content-equivalent manner. This is the first content-equivalent Japanese-English news corpus translated specifically for training NMT systems. The second problem involves the domain-adaptation technique. NMT systems suffer degraded performance when trained with mixed data having different features, such as noisy data and clean data. Though the existing methods try to overcome this problem by using tags for distinguishing the differences between corpora, it is not sufficient. We thus extend a domain-adaptation method using multi-tags to train an NMT model effectively with the clean corpus and existing parallel news corpora with some types of noise. Experimental results show that our corpus increases the translation quality, and that our domain-adaptation method is more effective for learning with the multiple types of corpora than existing domain-adaptation methods are.
pdf
bib
Proceedings of the 7th Workshop on Asian Translation
Toshiaki Nakazawa
|
Hideki Nakayama
|
Chenchen Ding
|
Raj Dabre
|
Anoop Kunchukuttan
|
Win Pa Pa
|
Ondřej Bojar
|
Shantipriya Parida
|
Isao Goto
|
Hideya Mino
|
Hiroshi Manabe
|
Katsuhito Sudoh
|
Sadao Kurohashi
|
Pushpak Bhattacharyya
Proceedings of the 7th Workshop on Asian Translation
pdf
bib
abs
Overview of the 7th Workshop on Asian Translation
Toshiaki Nakazawa
|
Hideki Nakayama
|
Chenchen Ding
|
Raj Dabre
|
Shohei Higashiyama
|
Hideya Mino
|
Isao Goto
|
Win Pa Pa
|
Anoop Kunchukuttan
|
Shantipriya Parida
|
Ondřej Bojar
|
Sadao Kurohashi
Proceedings of the 7th Workshop on Asian Translation
This paper presents the results of the shared tasks from the 7th workshop on Asian translation (WAT2020). For the WAT2020, 20 teams participated in the shared tasks and 14 teams submitted their translation results for the human evaluation. We also received 12 research paper submissions out of which 7 were accepted. About 500 translation results were submitted to the automatic evaluation server, and selected submissions were manually evaluated.
pdf
bib
abs
Neural Machine Translation Using Extracted Context Based on Deep Analysis for the Japanese-English Newswire Task at WAT 2020
Isao Goto
|
Hideya Mino
|
Hitoshi Ito
|
Kazutaka Kinugawa
|
Ichiro Yamada
|
Hideki Tanaka
Proceedings of the 7th Workshop on Asian Translation
This paper describes the system of the NHK-NES team for the WAT 2020 Japanese–English newswire task. There are two main problems in Japanese-English news translation: translation of dropped subjects and compatibility between equivalent translations and English news-style outputs. We address these problems by extracting subjects from the context based on predicate-argument structures and using them as additional inputs, and constructing parallel Japanese-English news sentences equivalently translated from English news sentences. The evaluation results confirm the effectiveness of our context-utilization method.
2019
pdf
bib
Proceedings of the 6th Workshop on Asian Translation
Toshiaki Nakazawa
|
Chenchen Ding
|
Raj Dabre
|
Anoop Kunchukuttan
|
Nobushige Doi
|
Yusuke Oda
|
Ondřej Bojar
|
Shantipriya Parida
|
Isao Goto
|
Hideya Mino
Proceedings of the 6th Workshop on Asian Translation
pdf
bib
abs
Overview of the 6th Workshop on Asian Translation
Toshiaki Nakazawa
|
Nobushige Doi
|
Shohei Higashiyama
|
Chenchen Ding
|
Raj Dabre
|
Hideya Mino
|
Isao Goto
|
Win Pa Pa
|
Anoop Kunchukuttan
|
Yusuke Oda
|
Shantipriya Parida
|
Ondřej Bojar
|
Sadao Kurohashi
Proceedings of the 6th Workshop on Asian Translation
This paper presents the results of the shared tasks from the 6th workshop on Asian translation (WAT2019) including Ja↔En, Ja↔Zh scientific paper translation subtasks, Ja↔En, Ja↔Ko, Ja↔En patent translation subtasks, Hi↔En, My↔En, Km↔En, Ta↔En mixed domain subtasks and Ru↔Ja news commentary translation task. For the WAT2019, 25 teams participated in the shared tasks. We also received 10 research paper submissions out of which 6 were accepted. About 400 translation results were submitted to the automatic evaluation server, and selected submissions were manually evaluated.
pdf
bib
abs
Neural Machine Translation System using a Content-equivalently Translated Parallel Corpus for the Newswire Translation Tasks at WAT 2019
Hideya Mino
|
Hitoshi Ito
|
Isao Goto
|
Ichiro Yamada
|
Hideki Tanaka
|
Takenobu Tokunaga
Proceedings of the 6th Workshop on Asian Translation
This paper describes NHK and NHK Engineering System (NHK-ES)’s submission to the newswire translation tasks of WAT 2019 in both directions of Japanese→English and English→Japanese. In addition to the JIJI Corpus that was officially provided by the task organizer, we developed a corpus of 0.22M sentence pairs by manually translating Japanese news sentences into English content-equivalently. The content-equivalent corpus was effective for improving translation quality, and our systems achieved the best human evaluation scores in the newswire translation tasks at WAT 2019.
2018
pdf
bib
Overview of the 5th Workshop on Asian Translation
Toshiaki Nakazawa
|
Katsuhito Sudoh
|
Shohei Higashiyama
|
Chenchen Ding
|
Raj Dabre
|
Hideya Mino
|
Isao Goto
|
Win Pa Pa
|
Anoop Kunchukuttan
|
Sadao Kurohashi
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 5th Workshop on Asian Translation
2017
pdf
bib
abs
Detecting Untranslated Content for Neural Machine Translation
Isao Goto
|
Hideki Tanaka
Proceedings of the First Workshop on Neural Machine Translation
Despite its promise, neural machine translation (NMT) has a serious problem in that source content may be mistakenly left untranslated. The ability to detect untranslated content is important for the practical use of NMT. We evaluate two types of probability with which to detect untranslated content: the cumulative attention (ATN) probability and back translation (BT) probability from the target sentence to the source sentence. Experiments on detecting untranslated content in Japanese-English patent translations show that ATN and BT are each more effective than random choice, BT is more effective than ATN, and the combination of the two provides further improvements. We also confirmed the effectiveness of using ATN and BT to rerank the n-best NMT outputs.
pdf
bib
Proceedings of the 4th Workshop on Asian Translation (WAT2017)
Toshiaki Nakazawa
|
Isao Goto
Proceedings of the 4th Workshop on Asian Translation (WAT2017)
pdf
bib
abs
Overview of the 4th Workshop on Asian Translation
Toshiaki Nakazawa
|
Shohei Higashiyama
|
Chenchen Ding
|
Hideya Mino
|
Isao Goto
|
Hideto Kazawa
|
Yusuke Oda
|
Graham Neubig
|
Sadao Kurohashi
Proceedings of the 4th Workshop on Asian Translation (WAT2017)
This paper presents the results of the shared tasks from the 4th workshop on Asian translation (WAT2017) including J↔E, J↔C scientific paper translation subtasks, C↔J, K↔J, E↔J patent translation subtasks, H↔E mixed domain subtasks, J↔E newswire subtasks and J↔E recipe subtasks. For the WAT2017, 12 institutions participated in the shared tasks. About 300 translation results have been submitted to the automatic evaluation server, and selected submissions were manually evaluated.
2016
pdf
bib
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)
Toshiaki Nakazawa
|
Hideya Mino
|
Chenchen Ding
|
Isao Goto
|
Graham Neubig
|
Sadao Kurohashi
|
Ir. Hammam Riza
|
Pushpak Bhattacharyya
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)
pdf
bib
abs
Overview of the 3rd Workshop on Asian Translation
Toshiaki Nakazawa
|
Chenchen Ding
|
Hideya Mino
|
Isao Goto
|
Graham Neubig
|
Sadao Kurohashi
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)
This paper presents the results of the shared tasks from the 3rd workshop on Asian translation (WAT2016) including J ↔ E, J ↔ C scientific paper translation subtasks, C ↔ J, K ↔ J, E ↔ J patent translation subtasks, I ↔ E newswire subtasks and H ↔ E, H ↔ J mixed domain subtasks. For the WAT2016, 15 institutions participated in the shared tasks. About 500 translation results have been submitted to the automatic evaluation server, and selected submissions were manually evaluated.
2015
pdf
bib
Japanese news simplification: task design, data set construction, and analysis of simplified text
Isao Goto
|
Hideki Tanaka
|
Tadashi Kumano
Proceedings of Machine Translation Summit XV: Papers
pdf
bib
The “News Web Easy” news service as a resource for teaching and learning Japanese: An assessment of the comprehension difficulty of Japanese sentence-end expressions
Hideki Tanaka
|
Tadashi Kumano
|
Isao Goto
Proceedings of the 2nd Workshop on Natural Language Processing Techniques for Educational Applications
bib
Proceedings of the 2nd Workshop on Asian Translation (WAT2015)
Toshiaki Nakazawa
|
Hideya Mino
|
Isao Goto
|
Graham Neubig
|
Sadao Kurohashi
|
Eiichiro Sumita
Proceedings of the 2nd Workshop on Asian Translation (WAT2015)
pdf
bib
Overview of the 2nd Workshop on Asian Translation
Toshiaki Nakazawa
|
Hideya Mino
|
Isao Goto
|
Graham Neubig
|
Sadao Kurohashi
|
Eiichiro Sumita
Proceedings of the 2nd Workshop on Asian Translation (WAT2015)
2014
pdf
bib
Proceedings of the 1st Workshop on Asian Translation (WAT2014)
Toshiaki Nakazawa
|
Hideya Mino
|
Isao Goto
|
Sadao Kurohashi
|
Eiichiro Sumita
Proceedings of the 1st Workshop on Asian Translation (WAT2014)
pdf
bib
Overview of the 1st Workshop on Asian Translation
Toshiaki Nakazawa
|
Hideya Mino
|
Isao Goto
|
Sadao Kurohashi
|
Eiichiro Sumita
Proceedings of the 1st Workshop on Asian Translation (WAT2014)
2013
pdf
bib
Converting Continuous-Space Language Models into N-Gram Language Models for Statistical Machine Translation
Rui Wang
|
Masao Utiyama
|
Isao Goto
|
Eiichiro Sumita
|
Hai Zhao
|
Bao-Liang Lu
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
pdf
bib
Distortion Model Considering Rich Context for Statistical Machine Translation
Isao Goto
|
Masao Utiyama
|
Eiichiro Sumita
|
Akihiro Tamura
|
Sadao Kurohashi
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
2012
pdf
bib
Post-ordering by Parsing for Japanese-English Statistical Machine Translation
Isao Goto
|
Masao Utiyama
|
Eiichiro Sumita
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
2011
pdf
bib
A Comparison Study of Parsers for Patent Machine Translation
Isao Goto
|
Masao Utiyama
|
Takashi Onishi
|
Eiichiro Sumita
Proceedings of Machine Translation Summit XIII: Papers
2004
pdf
bib
Back Transliteration from Japanese to English using Target English Context
Isao Goto
|
Naoto Kato
|
Terumasa Ehara
|
Hideki Tanaka
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics
2003
pdf
bib
abs
Transliteration considering context information based on the maximum entropy method
Isao Goto
|
Naoto Kato
|
Noriyoshi Uratani
|
Terumasa Ehara
Proceedings of Machine Translation Summit IX: Papers
This paper proposes a method of automatic transliteration from English to Japanese words. Our method successfully transliterates an English word not registered in any bilingual or pronunciation dictionary by converting each partial letter sequence of the English word into Japanese katakana characters. In such transliteration, identical letters occurring in different English words must often be converted into different katakana. To produce an adequate transliteration, the proposed method considers chunking of alphabetic letters of an English word into conversion units and considers English and Japanese context information simultaneously to calculate the plausibility of conversion. We have confirmed experimentally that the proposed method improves the conversion accuracy by 63% compared to a simple method that ignores the plausibility of chunking and contextual information.
pdf
bib
abs
A multi-language translation example browser
Isao Goto
|
Naoto Kato
|
Noriyoshi Uratani
|
Terumasa Ehara
|
Tadashi Kumano
|
Hideki Tanaka
Proceedings of Machine Translation Summit IX: System Presentations
This paper describes a Multi-language Translation Example Browser, a type of translation memory system. The system is able to retrieve translation examples from bilingual news databases, which consist of news transcripts of past broadcasts. We put a Japanese-English system to practical use and undertook trial operations of a system of eight language-pairs.