<?xml version="1.0" encoding="UTF-8" ?>
<volume id="W17">
  <paper id="5700">
    <title>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</title>
    <editor><first>Toshiaki</first><last>Nakazawa</last></editor>
    <editor><first>Isao</first><last>Goto</last></editor>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <url>http://www.aclweb.org/anthology/W17-57</url>
    <bibtype>book</bibtype>
    <bibkey>WAT2017:2017</bibkey>
  </paper>

  <paper id="5701">
    <title>Overview of the 4th Workshop on Asian Translation</title>
    <author><first>Toshiaki</first><last>Nakazawa</last></author>
    <author><first>Shohei</first><last>Higashiyama</last></author>
    <author><first>Chenchen</first><last>Ding</last></author>
    <author><first>Hideya</first><last>Mino</last></author>
    <author><first>Isao</first><last>Goto</last></author>
    <author><first>Hideto</first><last>Kazawa</last></author>
    <author><first>Yusuke</first><last>Oda</last></author>
    <author><first>Graham</first><last>Neubig</last></author>
    <author><first>Sadao</first><last>Kurohashi</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>1&#8211;54</pages>
    <url>http://www.aclweb.org/anthology/W17-5701</url>
    <abstract>This paper presents the results of the shared tasks from the 4th Workshop on
	Asian Translation (WAT2017), including the J&#x2194;E and J&#x2194;C scientific paper translation
	subtasks, the C&#x2194;J, K&#x2194;J and E&#x2194;J patent translation subtasks, the H&#x2194;E mixed domain
	subtasks, the J&#x2194;E newswire subtasks and the J&#x2194;E recipe subtasks. For WAT2017,
	12 institutions participated in the shared tasks. About 300 translation results
	were submitted to the automatic evaluation server, and selected
	submissions were manually evaluated.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>nakazawa-EtAl:2017:WAT2017</bibkey>
  </paper>

  <paper id="5702">
    <title>Controlling Target Features in Neural Machine Translation via Prefix Constraints</title>
    <author><first>Shunsuke</first><last>Takeno</last></author>
    <author><first>Masaaki</first><last>Nagata</last></author>
    <author><first>Kazuhide</first><last>Yamamoto</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>55&#8211;63</pages>
    <url>http://www.aclweb.org/anthology/W17-5702</url>
    <abstract>We propose prefix constraints, a novel method to enforce
	  constraints on target sentences in neural machine translation. It
	  places a sequence of special tokens at the beginning of the target
	  sentence (target prefix), whereas side constraints
	  place a special token at the end of
	  the source sentence (source suffix). Prefix constraints can be predicted
	  from the source sentence jointly with the target sentence, while side
	  constraints must be provided by the user or predicted by some other
	  method. In both methods, special tokens are designed to encode
	  arbitrary target-side features or metatextual information. We
	  show that prefix constraints are more flexible than side constraints
	  and can be used to control the behavior of neural machine
	  translation in terms of output length, bidirectional decoding,
	  domain adaptation, and unaligned target word generation.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>takeno-nagata-yamamoto:2017:WAT2017</bibkey>
  </paper>

  <paper id="5703">
    <title>Improving Japanese-to-English Neural Machine Translation by Paraphrasing the Target Language</title>
    <author><first>Yuuki</first><last>Sekizawa</last></author>
    <author><first>Tomoyuki</first><last>Kajiwara</last></author>
    <author><first>Mamoru</first><last>Komachi</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>64&#8211;69</pages>
    <url>http://www.aclweb.org/anthology/W17-5703</url>
    <abstract>Neural machine translation (NMT) produces sentences that are more fluent than
	those produced by statistical machine translation (SMT). However, NMT has a
	very high computational cost because of the high dimensionality of the output
	layer. Generally, NMT restricts the size of the vocabulary, which results in
	infrequent words being treated as out-of-vocabulary (OOV) and degrades
	translation performance. To address this problem, we paraphrase rare words on
	the target side of the training data into more frequent expressions, thereby
	reducing the number of OOV words. In our evaluation, we achieved a
	statistically significant BLEU score improvement of 0.55-0.77 over baselines
	including the state-of-the-art method.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>sekizawa-kajiwara-komachi:2017:WAT2017</bibkey>
  </paper>

  <paper id="5704">
    <title>Improving Low-Resource Neural Machine Translation with Filtered Pseudo-Parallel Corpus</title>
    <author><first>Aizhan</first><last>Imankulova</last></author>
    <author><first>Takayuki</first><last>Sato</last></author>
    <author><first>Mamoru</first><last>Komachi</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>70&#8211;78</pages>
    <url>http://www.aclweb.org/anthology/W17-5704</url>
    <abstract>Large-scale parallel corpora are indispensable for training highly accurate
	machine translation systems.
	However, manually constructed large-scale parallel corpora are not freely
	available for many language pairs.
	In previous studies, training data have been expanded using a pseudo-parallel
	corpus obtained by machine translation of a monolingual corpus in the
	target language.
	However, for low-resource language pairs in which only low-accuracy machine
	translation systems are available, translation quality is reduced when a
	pseudo-parallel corpus is used naively.
	To improve machine translation performance on low-resource language pairs, we
	propose a method to expand the training data effectively by filtering the
	pseudo-parallel corpus using a quality estimation based on back-translation.
	In experiments with three language pairs using small, medium, and large
	parallel corpora, the language pairs with less training data filtered out
	more sentence pairs and showed larger BLEU score improvements.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>imankulova-sato-komachi:2017:WAT2017</bibkey>
  </paper>

  <paper id="5705">
    <title>Japanese to English/Chinese/Korean Datasets for Translation Quality Estimation and Automatic Post-Editing</title>
    <author><first>Atsushi</first><last>Fujita</last></author>
    <author><first>Eiichiro</first><last>Sumita</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>79&#8211;88</pages>
    <url>http://www.aclweb.org/anthology/W17-5705</url>
    <abstract>Aiming at facilitating research on quality estimation (QE) and
	automatic post-editing (APE) of machine translation (MT) outputs,
	especially among Asian languages, we have created new
	datasets for Japanese to English, Chinese, and Korean translations.
	As the source text, actual utterances in Japanese were extracted from the
	log data of our speech translation service. MT outputs were then produced by
	phrase-based statistical MT systems. Finally, human evaluators were employed
	to grade the quality of the MT outputs and to post-edit them.
	This paper describes the characteristics of the created datasets and
	reports on our benchmarking experiments on word-level QE,
	sentence-level QE, and APE conducted using these datasets.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>fujita-sumita:2017:WAT2017</bibkey>
  </paper>

  <paper id="5706">
    <title>NTT Neural Machine Translation Systems at WAT 2017</title>
    <author><first>Makoto</first><last>Morishita</last></author>
    <author><first>Jun</first><last>Suzuki</last></author>
    <author><first>Masaaki</first><last>Nagata</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>89&#8211;94</pages>
    <url>http://www.aclweb.org/anthology/W17-5706</url>
    <abstract>This year, we participated in four translation subtasks at WAT 2017.
	Our model structure is quite simple, but we used it with well-tuned
	hyper-parameters, leading to a significant improvement over the previous
	state-of-the-art system.
	We also made use of the unreliable part of the provided parallel
	corpus by back-translating it and building a synthetic corpus.
	Our submitted system achieved new state-of-the-art performance in terms of
	both the BLEU score and human evaluation.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>morishita-suzuki-nagata:2017:WAT2017</bibkey>
  </paper>

  <paper id="5707">
    <title>XMU Neural Machine Translation Systems for WAT 2017</title>
    <author><first>Boli</first><last>Wang</last></author>
    <author><first>Zhixing</first><last>Tan</last></author>
    <author><first>Jinming</first><last>Hu</last></author>
    <author><first>Yidong</first><last>Chen</last></author>
    <author><first>Xiaodong</first><last>Shi</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>95&#8211;98</pages>
    <url>http://www.aclweb.org/anthology/W17-5707</url>
    <abstract>This paper describes the neural machine translation systems of Xiamen
	University for the shared translation tasks of WAT 2017. Our systems are based
	on the encoder-decoder framework with attention. We participated in three
	subtasks and experimented with subword segmentation, synthetic training data,
	and model ensembling. Experiments show that all of these methods give
	substantial improvements.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>wang-EtAl:2017:WAT2017</bibkey>
  </paper>

  <paper id="5708">
    <title>A Bag of Useful Tricks for Practical Neural Machine Translation: Embedding Layer Initialization and Large Batch Size</title>
    <author><first>Masato</first><last>Neishi</last></author>
    <author><first>Jin</first><last>Sakuma</last></author>
    <author><first>Satoshi</first><last>Tohda</last></author>
    <author><first>Shonosuke</first><last>Ishiwatari</last></author>
    <author><first>Naoki</first><last>Yoshinaga</last></author>
    <author><first>Masashi</first><last>Toyoda</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>99&#8211;109</pages>
    <url>http://www.aclweb.org/anthology/W17-5708</url>
    <abstract>In this paper, we describe team UT-IIS's system and results for the WAT
	2017 translation tasks. We investigated several tricks, including a
	novel technique for initializing embedding layers using only the parallel
	corpus, which increased the BLEU score by 1.28; we also found a practical
	large batch size of 256 and gained insights regarding hyperparameter
	settings. Ultimately, our system obtained a better result than the
	state-of-the-art system of WAT 2016. Our code is available at
	https://github.com/nem6ishi/wat17.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>neishi-EtAl:2017:WAT2017</bibkey>
  </paper>

  <paper id="5709">
    <title>Patent NMT integrated with Large Vocabulary Phrase Translation by SMT at WAT 2017</title>
    <author><first>Zi</first><last>Long</last></author>
    <author><first>Ryuichiro</first><last>Kimura</last></author>
    <author><first>Takehito</first><last>Utsuro</last></author>
    <author><first>Tomoharu</first><last>Mitsuhashi</last></author>
    <author><first>Mikio</first><last>Yamamoto</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>110&#8211;118</pages>
    <url>http://www.aclweb.org/anthology/W17-5709</url>
    <abstract>Neural machine translation (NMT) cannot handle a large vocabulary
	 because the training complexity and decoding complexity increase
	 proportionally with the number of target words. This problem becomes even
	 more serious when translating patent documents, which contain many
	 technical terms that are observed infrequently. Long et al. (2017)
	 proposed selecting phrases that contain out-of-vocabulary words using
	 the statistical approach of branching entropy. The selected phrases
	 are then replaced with tokens during training and post-translated using
	 the phrase translation table of SMT. In this paper, we apply the
	 method proposed by Long et al. (2017) to the WAT 2017 Japanese-Chinese
	 and Japanese-English patent datasets. Evaluation on
	 Japanese-to-Chinese, Chinese-to-Japanese, Japanese-to-English and
	 English-to-Japanese patent sentence translation proved the
	 effectiveness of the phrases selected with branching entropy: the NMT
	 model of Long et al. (2017) achieves a substantial improvement over a
	 baseline NMT model without the proposed technique.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>long-EtAl:2017:WAT2017</bibkey>
  </paper>

  <paper id="5710">
    <title>SMT reranked NMT</title>
    <author><first>Terumasa</first><last>Ehara</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>119&#8211;126</pages>
    <url>http://www.aclweb.org/anthology/W17-5710</url>
    <abstract>This paper describes the system architecture, experimental settings and
	experimental results of the EHR team for the WAT2017 tasks. We participated
	in three tasks: JPCen-ja, JPCzh-ja and JPCko-ja. Although the basic
	architecture of our system is NMT, we rerank the NMT outputs using SMT
	results. Major drawbacks of NMT are under-translation and over-translation,
	whereas SMT rarely produces such translations. Therefore, by reranking the
	n-best NMT outputs with the SMT output, such translations can be expected to
	be discarded. With this technique, we improved the BLEU score from 46.03 to
	47.08 in the JPCzh-ja task.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>ehara:2017:WAT2017</bibkey>
  </paper>

  <paper id="5711">
    <title>Ensemble and Reranking: Using Multiple Models in the NICT-2 Neural Machine Translation System at WAT2017</title>
    <author><first>Kenji</first><last>Imamura</last></author>
    <author><first>Eiichiro</first><last>Sumita</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>127&#8211;134</pages>
    <url>http://www.aclweb.org/anthology/W17-5711</url>
    <abstract>In this paper, we describe the NICT-2 neural machine translation system
	evaluated at WAT2017. This system uses multiple models as an ensemble and
	combines models with opposite decoding directions by reranking (called
	bi-directional reranking).
	In our experiments on small data sets, the translation quality improved as
	the number of models was increased up to 32 in total, without saturating.
	In the experiments on large data sets, improvements of 1.59-3.32
	BLEU points were achieved when six-model ensembles were combined by
	bi-directional reranking.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>imamura-sumita:2017:WAT2017</bibkey>
  </paper>

  <paper id="5712">
    <title>A Simple and Strong Baseline: NAIST-NICT Neural Machine Translation System for WAT2017 English-Japanese Translation Task</title>
    <author><first>Yusuke</first><last>Oda</last></author>
    <author><first>Katsuhito</first><last>Sudoh</last></author>
    <author><first>Satoshi</first><last>Nakamura</last></author>
    <author><first>Masao</first><last>Utiyama</last></author>
    <author><first>Eiichiro</first><last>Sumita</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>135&#8211;139</pages>
    <url>http://www.aclweb.org/anthology/W17-5712</url>
    <abstract>This paper describes the details of the NAIST-NICT machine translation
	system for the WAT2017 English-Japanese Scientific Paper Translation Task. The
	system consists of a language-independent tokenizer and an attentional
	encoder-decoder style neural machine translation model. According to the
	official results, our system achieves higher translation accuracy than any
	system submitted in previous campaigns, despite its simple model
	architecture.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>oda-EtAl:2017:WAT2017</bibkey>
  </paper>

  <paper id="5713">
    <title>Comparison of SMT and NMT trained with large Patent Corpora: Japio at WAT2017</title>
    <author><first>Satoshi</first><last>Kinoshita</last></author>
    <author><first>Tadaaki</first><last>Oshio</last></author>
    <author><first>Tomoharu</first><last>Mitsuhashi</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>140&#8211;145</pages>
    <url>http://www.aclweb.org/anthology/W17-5713</url>
    <abstract>Japio participates in the patent subtasks (JPC-EJ/JE/CJ/KJ) with phrase-based
	statistical machine translation (SMT) and neural machine translation (NMT)
	systems which are trained with its own patent corpora in addition to the
	subtask corpora provided by the organizers of WAT2017. In the EJ and CJ
	subtasks, SMT and NMT systems trained on about 50 million and 10 million
	sentence pairs, respectively, achieved comparable scores in the automatic
	evaluations, but the NMT systems were superior to the SMT systems in both
	the official and in-house human evaluations.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>kinoshita-oshio-mitsuhashi:2017:WAT2017</bibkey>
  </paper>

  <paper id="5714">
    <title>Kyoto University Participation to WAT 2017</title>
    <author><first>Fabien</first><last>Cromieres</last></author>
    <author><first>Raj</first><last>Dabre</last></author>
    <author><first>Toshiaki</first><last>Nakazawa</last></author>
    <author><first>Sadao</first><last>Kurohashi</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>146&#8211;153</pages>
    <url>http://www.aclweb.org/anthology/W17-5714</url>
    <abstract>We describe here our approaches and results for the WAT 2017 shared
	  translation tasks. Following our good results with neural machine
	  translation in the previous shared task, we continue this approach this
	  year, with incremental improvements to models and training methods. We
	  focused on the ASPEC dataset and were able to improve the state-of-the-art
	  results for Chinese-to-Japanese and Japanese-to-Chinese translation.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>cromieres-EtAl:2017:WAT2017</bibkey>
  </paper>

  <paper id="5715">
    <title>CUNI NMT System for WAT 2017 Translation Tasks</title>
    <author><first>Tom</first><last>Kocmi</last></author>
    <author><first>Du&#x161;an</first><last>Vari&#x161;</last></author>
    <author><first>Ond&#x159;ej</first><last>Bojar</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>154&#8211;159</pages>
    <url>http://www.aclweb.org/anthology/W17-5715</url>
    <abstract>The paper presents this year's CUNI submissions to the WAT 2017 Translation
	Task, focusing on Japanese-English translation, namely the Scientific Papers,
	Patents and Newswire subtasks. We compare two neural network architectures,
	the standard sequence-to-sequence model with attention (Seq2Seq) and an
	architecture using a convolutional sentence encoder (FBConv2Seq), both
	implemented in Neural Monkey, an NMT framework that we are participating in
	developing. We also compare various types of preprocessing of the source
	Japanese sentences and their impact on the overall results. Furthermore, we
	include the results of our experiments with out-of-domain data obtained by
	combining the corpora provided for each subtask.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>kocmi-varivs-bojar:2017:WAT2017</bibkey>
  </paper>

  <paper id="5716">
    <title>Tokyo Metropolitan University Neural Machine Translation System for WAT 2017</title>
    <author><first>Yukio</first><last>Matsumura</last></author>
    <author><first>Mamoru</first><last>Komachi</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>160&#8211;166</pages>
    <url>http://www.aclweb.org/anthology/W17-5716</url>
    <abstract>In this paper, we describe our neural machine translation (NMT) system, which
	is based on attention-based NMT and uses long short-term memory (LSTM) units
	as the RNN. We implemented beam search and ensemble decoding in the NMT
	system. The system was tested on the shared tasks of the 4th Workshop on
	Asian Translation (WAT 2017). We participated in the scientific paper
	subtasks, attempting the Japanese-English, English-Japanese, and
	Japanese-Chinese translation tasks. The experimental results showed that
	implementing beam search and ensemble decoding can effectively improve
	translation quality.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>matsumura-komachi:2017:WAT2017</bibkey>
  </paper>

  <paper id="5717">
    <title>Comparing Recurrent and Convolutional Architectures for English-Hindi Neural Machine Translation</title>
    <author><first>Sandhya</first><last>Singh</last></author>
    <author><first>Ritesh</first><last>Panjwani</last></author>
    <author><first>Anoop</first><last>Kunchukuttan</last></author>
    <author><first>Pushpak</first><last>Bhattacharyya</last></author>
    <booktitle>Proceedings of the 4th Workshop on Asian Translation (WAT2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Asian Federation of Natural Language Processing</publisher>
    <pages>167&#8211;170</pages>
    <url>http://www.aclweb.org/anthology/W17-5717</url>
    <abstract>In this paper, we empirically compare two encoder-decoder neural machine
	translation architectures, the convolutional sequence-to-sequence model
	(ConvS2S) and the recurrent sequence-to-sequence model (RNNS2S), for the
	English-Hindi language pair, as part of IIT Bombay's submission to the
	WAT2017 shared task. We report results for both the English-Hindi and
	Hindi-English translation directions.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>singh-EtAl:2017:WAT2017</bibkey>
  </paper>

</volume>

