Proceedings of the 6th Workshop on Asian Translation

Toshiaki Nakazawa, Chenchen Ding, Raj Dabre, Anoop Kunchukuttan, Nobushige Doi, Yusuke Oda, Ondřej Bojar, Shantipriya Parida, Isao Goto, Hideya Mino (Editors)


Anthology ID: D19-52
Month: November
Year: 2019
Address: Hong Kong, China
Venues: EMNLP | WAT | WS
Publisher: Association for Computational Linguistics
URL: https://aclanthology.org/D19-52
PDF: https://aclanthology.org/D19-52.pdf

Overview of the 6th Workshop on Asian Translation
Toshiaki Nakazawa | Nobushige Doi | Shohei Higashiyama | Chenchen Ding | Raj Dabre | Hideya Mino | Isao Goto | Win Pa Pa | Anoop Kunchukuttan | Yusuke Oda | Shantipriya Parida | Ondřej Bojar | Sadao Kurohashi

This paper presents the results of the shared tasks from the 6th Workshop on Asian Translation (WAT2019), including Ja↔En, Ja↔Zh scientific paper translation subtasks, Ja↔En, Ja↔Ko, Ja↔En patent translation subtasks, Hi↔En, My↔En, Km↔En, Ta↔En mixed domain subtasks and the Ru↔Ja news commentary translation task. For WAT2019, 25 teams participated in the shared tasks. We also received 10 research paper submissions, of which 6 were accepted. About 400 translation results were submitted to the automatic evaluation server, and selected submissions were manually evaluated.

Compact and Robust Models for Japanese-English Character-level Machine Translation
Jinan Dai | Kazunori Yamaguchi

Character-level translation has been shown to achieve good translation quality without explicit segmentation, but training a character-level model requires substantial hardware resources. In this paper, we introduced two character-level translation models, the mid-gated model and the multi-attention model, for Japanese-English translation. We showed that the mid-gated model achieved better BLEU scores. We also showed that a relatively narrow beam of width 4 or 5 was sufficient for the mid-gated model. As for unknown words, we showed that the mid-gated model could to some extent translate those containing Katakana by coining a close word. We also showed that the model managed to produce tolerable results for heavily noised sentences, even though it was trained on a dataset without noise.

Controlling Japanese Honorifics in English-to-Japanese Neural Machine Translation
Weston Feely | Eva Hasler | Adrià de Gispert

In the Japanese language, different levels of honorific speech are used to convey respect, deference, humility, formality and social distance. In this paper, we present a method for controlling the level of formality of Japanese output in English-to-Japanese neural machine translation (NMT). By using heuristics to identify honorific verb forms, we classify Japanese sentences in parallel text into one of three levels: informal, polite, or formal speech. The English source side is marked with a feature that identifies the level of honorific speech present in the Japanese target side. We use this parallel text to train an English-Japanese NMT model capable of producing Japanese translations in different honorific speech styles for the same English input sentence.
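
The source-side marking described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the suffix lists are hypothetical heuristics, the function names are invented for this example, and a prefixed token is only one simple way to realize the "feature" the abstract mentions.

```python
# Hypothetical heuristics: these sentence-final forms are illustrative
# assumptions, not the rules used in the paper.
FORMAL_MARKERS = ("ございます", "いたします", "おります", "申し上げます")
POLITE_MARKERS = ("です", "ます", "ました", "ません", "でした", "でしょう")
TRAILING_PUNCTUATION = "。．.!！?？ 　"


def classify_formality(ja_sentence: str) -> str:
    """Return 'formal', 'polite', or 'informal' from the sentence-final form."""
    text = ja_sentence.rstrip(TRAILING_PUNCTUATION)
    # Check formal endings first, because they also end in polite forms.
    if text.endswith(FORMAL_MARKERS):
        return "formal"
    if text.endswith(POLITE_MARKERS):
        return "polite"
    return "informal"


def tag_source(en_sentence: str, ja_sentence: str) -> str:
    """Prepend a formality token, derived from the target side, to the source."""
    return f"<{classify_formality(ja_sentence)}> {en_sentence}"
```

At training time each English sentence would carry the tag derived from its Japanese reference; at inference time the user chooses the tag, so the same English input can be translated in any of the three styles.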

Designing the Business Conversation Corpus
Matīss Rikters | Ryokan Ri | Tong Li | Toshiaki Nakazawa

While the progress of machine translation of written text has come far in the past several years thanks to the increasing availability of parallel corpora and corpora-based training technologies, automatic translation of spoken text and dialogues remains challenging even for modern systems. In this paper, we aim to boost the machine translation quality of conversational texts by introducing a newly constructed Japanese-English business conversation parallel corpus. A detailed analysis of the corpus is provided along with challenging examples for automatic translation. We also experiment with adding the corpus in a machine translation training scenario and show how the resulting system benefits from its use.

English to Hindi Multi-modal Neural Machine Translation and Hindi Image Captioning
Sahinur Rahman Laskar | Rohit Pratap Singh | Partha Pakray | Sivaji Bandyopadhyay

With the widespread use of Machine Translation (MT) techniques, attempts are made to minimize the communication gap among people from diverse linguistic backgrounds. We participated in the Workshop on Asian Translation 2019 (WAT2019) multi-modal translation task. There are three submission tracks for English-to-Hindi translation, namely multi-modal translation, Hindi-only image captioning and text-only translation. The main challenge is to provide a precise MT output. The multi-modal concept incorporates textual and visual features in the translation task. In this work, the multi-modal translation track relies on a pre-trained convolutional neural network (CNN), the 19-layer Visual Geometry Group network (VGG19), to extract image features and an attention-based Neural Machine Translation (NMT) system for translation. A merge model of a recurrent neural network (RNN) and a CNN is used for Hindi-only image captioning. The text-only translation track is based on the transformer model of the NMT system. The official results of the WAT2019 translation task show that our multi-modal NMT system achieved a Bilingual Evaluation Understudy (BLEU) score of 20.37, a Rank-based Intuitive Bilingual Evaluation Score (RIBES) of 0.642838 and an Adequacy-Fluency Metrics (AMFM) score of 0.668260 on the challenge test data, and a BLEU score of 40.55, a RIBES of 0.760080 and an AMFM score of 0.770860 on the evaluation test data in English-to-Hindi multi-modal translation.

Supervised and Unsupervised Machine Translation for Myanmar-English and Khmer-English
Benjamin Marie | Hour Kaing | Aye Myat Mon | Chenchen Ding | Atsushi Fujita | Masao Utiyama | Eiichiro Sumita

This paper presents NICT’s supervised and unsupervised machine translation systems for the WAT2019 Myanmar-English and Khmer-English translation tasks. For all the translation directions, we built state-of-the-art supervised neural (NMT) and statistical (SMT) machine translation systems, using cleaned and normalized monolingual data. Our combination of NMT and SMT performed among the best systems for the four translation directions. We also investigated the feasibility of unsupervised machine translation for low-resource and distant language pairs and confirmed observations of previous work showing that unsupervised MT is still largely unable to deal with them.

NICT’s participation to WAT 2019: Multilingualism and Multi-step Fine-Tuning for Low Resource NMT
Raj Dabre | Eiichiro Sumita

In this paper we describe our submissions to WAT 2019 for the following tasks: English–Tamil translation and Russian–Japanese translation. Our team, “NICT-5”, focused on multilingual domain adaptation and back-translation for Russian–Japanese translation and on simple fine-tuning for English–Tamil translation. We noted that multi-stage fine-tuning is essential in leveraging the power of multilingualism for an extremely low-resource language pair like Russian–Japanese. Furthermore, we can improve the performance of such a low-resource language pair by exploiting a small but in-domain monolingual corpus via back-translation. We managed to obtain second rank in both tasks for all translation directions.

KNU-HYUNDAI’s NMT system for Scientific Paper and Patent Tasks on WAT 2019
Cheoneum Park | Young-Jun Jung | Kihoon Kim | Geonyeong Kim | Jae-Won Jeon | Seongmin Lee | Junseok Kim | Changki Lee

In this paper, we describe the neural machine translation (NMT) system submitted by the Kangwon National University and HYUNDAI (KNU-HYUNDAI) team to the translation tasks of the 6th Workshop on Asian Translation (WAT 2019). We participated in all tasks of ASPEC and JPC2, which included those of Chinese-Japanese, English-Japanese, and Korean->Japanese. We submitted our transformer-based NMT system, built using the following methods: a) a relative positioning method for pairwise relationships between the input elements, b) back-translation and multi-source translation for data augmentation, c) a right-to-left (r2l) reranking model robust against error propagation in autoregressive architectures such as decoders, and d) checkpoint ensemble models, which selected the top three models with the best validation bilingual evaluation understudy (BLEU) scores. We report the translation results on the two aforementioned tasks. We performed well in both tasks and were ranked first in terms of BLEU scores in all the JPC2 subtasks we participated in.

English-Myanmar Supervised and Unsupervised NMT: NICT’s Machine Translation Systems at WAT-2019
Rui Wang | Haipeng Sun | Kehai Chen | Chenchen Ding | Masao Utiyama | Eiichiro Sumita

This paper presents NICT’s participation (team ID: NICT) in the 6th Workshop on Asian Translation (WAT-2019) shared translation task, specifically the Myanmar (Burmese)-English task in both translation directions. We built neural machine translation (NMT) systems for these tasks. Our NMT systems were trained with language model pretraining, and back-translation was also applied. Our NMT systems rank third in English-to-Myanmar and second in Myanmar-to-English according to BLEU score.

UCSMNLP: Statistical Machine Translation for WAT 2019
Aye Thida | Nway Nway Han | Sheinn Thawtar Oo | Khin Thet Htar

This paper presents UCSMNLP’s submission to the WAT 2019 Translation Tasks, focusing on Myanmar-English translation. A phrase-based statistical machine translation (PBSMT) system is built using additional resources: a Named Entity Recognition (NER) corpus and a bilingual dictionary created with Google Translate (GT). The system also applies a listwise reranking process to improve translation quality, and tuning is done by changing the initial distortion weight. The experimental results show that PBSMT using the additional resources, an initial distortion weight of 0.4 and the listwise reranking function outperforms the baseline system.

NTT Neural Machine Translation Systems at WAT 2019
Makoto Morishita | Jun Suzuki | Masaaki Nagata

In this paper, we describe our systems that were submitted to the translation shared tasks at WAT 2019. This year, we participated in two distinct types of subtasks, a scientific paper subtask and a timely disclosure subtask, where we only considered English-to-Japanese and Japanese-to-English translation directions. We submitted two systems (En-Ja and Ja-En) for the scientific paper subtask and two systems (Ja-En, texts, items) for the timely disclosure subtask. Three of our four systems obtained the best human evaluation performances. We also confirmed that our new additional web-crawled parallel corpus improves the performance in unconstrained settings.

Neural Machine Translation System using a Content-equivalently Translated Parallel Corpus for the Newswire Translation Tasks at WAT 2019
Hideya Mino | Hitoshi Ito | Isao Goto | Ichiro Yamada | Hideki Tanaka | Takenobu Tokunaga

This paper describes NHK and NHK Engineering System (NHK-ES)’s submission to the newswire translation tasks of WAT 2019 in both directions of Japanese→English and English→Japanese. In addition to the JIJI Corpus that was officially provided by the task organizer, we developed a corpus of 0.22M sentence pairs by manually translating Japanese news sentences into English content-equivalently. The content-equivalent corpus was effective for improving translation quality, and our systems achieved the best human evaluation scores in the newswire translation tasks at WAT 2019.

Facebook AI’s WAT19 Myanmar-English Translation Task Submission
Peng-Jen Chen | Jiajun Shen | Matthew Le | Vishrav Chaudhary | Ahmed El-Kishky | Guillaume Wenzek | Myle Ott | Marc’Aurelio Ranzato

This paper describes Facebook AI’s submission to the WAT 2019 Myanmar-English translation task. Our baseline systems are BPE-based transformer models. We explore methods to leverage monolingual data to improve generalization, including self-training, back-translation and their combination. We further improve results by using noisy channel re-ranking and ensembling. We demonstrate that these techniques can significantly improve not only a system trained with additional monolingual data, but even the baseline system trained exclusively on the provided small parallel dataset. Our system ranks first in both directions according to human evaluation and BLEU, with a gain of over 8 BLEU points above the second best system.
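
Noisy channel re-ranking, as named in this abstract, is commonly formulated as a weighted sum of three log-probabilities per candidate. The sketch below assumes that common formulation; the `Candidate` fields, the weight names and the default weights are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    direct: float   # log P(y | x) from the forward translation model
    channel: float  # log P(x | y) from the reverse translation model
    lm: float       # log P(y) from a target-side language model


def noisy_channel_score(c: Candidate, channel_weight: float = 1.0,
                        lm_weight: float = 1.0) -> float:
    """Combine the three log-probabilities into one re-ranking score."""
    return c.direct + channel_weight * c.channel + lm_weight * c.lm


def rerank(candidates, channel_weight: float = 1.0, lm_weight: float = 1.0):
    """Return the candidates sorted from best to worst combined score."""
    return sorted(
        candidates,
        key=lambda c: noisy_channel_score(c, channel_weight, lm_weight),
        reverse=True,
    )
```

In practice the candidates would come from the n-best list of beam search, and the two weights would be tuned on a validation set.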

Combining Translation Memory with Neural Machine Translation
Akiko Eriguchi | Spencer Rarrick | Hitokazu Matsushita

In this paper, we report our submission systems (geoduck) to the Timely Disclosure task on the 6th Workshop on Asian Translation (WAT) (Nakazawa et al., 2019). Our system employs a combined approach of translation memory and Neural Machine Translation (NMT) models, where we can select final translation outputs from either a translation memory or an NMT system, when the similarity score of a test source sentence exceeds the predefined threshold. We observed that this combination approach significantly improves the translation performance on the Timely Disclosure corpus, as compared to a standalone NMT system. We also conducted source-based direct assessment on the final output, and we discuss the comparison between human references and each system’s output.
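
The selection rule in this abstract (use the translation memory when the similarity score exceeds a threshold, otherwise use NMT) can be sketched as below. The similarity measure (`difflib.SequenceMatcher`), the threshold value of 0.9 and the `nmt_translate` callback are assumptions made for illustration; the abstract does not specify them.

```python
from difflib import SequenceMatcher


def best_tm_match(source: str, memory: dict):
    """Return (similarity, translation) for the closest translation-memory entry."""
    best = (0.0, None)
    for tm_source, tm_target in memory.items():
        score = SequenceMatcher(None, source, tm_source).ratio()
        if score > best[0]:
            best = (score, tm_target)
    return best


def translate(source: str, memory: dict, nmt_translate, threshold: float = 0.9) -> str:
    """Use the TM translation when similarity exceeds the threshold, else NMT."""
    score, tm_target = best_tm_match(source, memory)
    if tm_target is not None and score > threshold:
        return tm_target
    return nmt_translate(source)
```

A high threshold restricts the memory to near-exact matches, which suits a domain with many repeated formulaic sentences.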

CVIT’s submissions to WAT-2019
Jerin Philip | Shashank Siripragada | Upendra Kumar | Vinay Namboodiri | C V Jawahar

This paper describes the Neural Machine Translation systems used by IIIT Hyderabad (CVIT-MT) for the translation tasks that are part of WAT-2019. We participated in tasks pertaining to Indian languages and submitted results for English-Hindi, Hindi-English, English-Tamil and Tamil-English language pairs. We employ the Transformer architecture, experimenting with multilingual models and methods for low-resource languages.

LTRC-MT Simple & Effective Hindi-English Neural Machine Translation Systems at WAT 2019
Vikrant Goyal | Dipti Misra Sharma

This paper describes the Neural Machine Translation systems of IIIT-Hyderabad (LTRC-MT) for WAT 2019 Hindi-English shared task. We experimented with both Recurrent Neural Networks & Transformer architectures. We also show the results of our experiments of training NMT models using additional data via backtranslation.

Long Warm-up and Self-Training: Training Strategies of NICT-2 NMT System at WAT-2019
Kenji Imamura | Eiichiro Sumita

This paper describes the NICT-2 neural machine translation system at the 6th Workshop on Asian Translation. This system employs the standard Transformer model but features the following two characteristics. One is the long warm-up strategy, which performs a longer warm-up of the learning rate at the start of the training than conventional approaches. Another is that the system introduces self-training approaches based on multiple back-translations generated by sampling. We participated in three tasks—ASPEC.en-ja, ASPEC.ja-en, and TDDC.ja-en—using this system.
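
To illustrate what a longer learning-rate warm-up changes, the sketch below uses the inverse-square-root schedule from the original Transformer paper. This is an assumption: the abstract does not state which schedule or warm-up lengths the system uses, and the step counts in the usage note are illustrative only.

```python
def inverse_sqrt_lr(step: int, warmup_steps: int, d_model: int = 512) -> float:
    """Learning rate at a 1-based step: linear warm-up, then 1/sqrt(step) decay."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
```

Under this schedule the rate peaks at `step == warmup_steps`, so quadrupling the warm-up (for example from 4000 to 16000 steps) halves the peak rate and makes the early updates smaller, while both settings follow the same decay curve once the longer warm-up has finished.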

Supervised neural machine translation based on data augmentation and improved training & inference process
Yixuan Tong | Liang Liang | Boyan Liu | Shanshan Jiang | Bin Dong

This is the second time SRCB has participated in WAT. This paper describes the neural machine translation systems for the shared translation tasks of WAT 2019. We participated in the ASPEC tasks and submitted results for four language pairs: English-Japanese, Japanese-English, Chinese-Japanese, and Japanese-Chinese. We employed the Transformer model as the baseline and experimented with relative position representation, data augmentation, deep layer models, and ensembling. Experiments show that all these methods can yield substantial improvements.

Sarah’s Participation in WAT 2019
Raymond Hendy Susanto | Ohnmar Htun | Liling Tan

This paper describes our MT systems’ participation in WAT 2019. We participated in the (i) Patent, (ii) Timely Disclosure, (iii) Newswire and (iv) Mixed-domain tasks. Our main focus is to explore how similar Transformer models perform on various tasks. We observed that for tasks with smaller datasets, our best model setups are shallower models with fewer attention heads. We investigated practical issues in NMT that often appear in production settings, such as coping with multilinguality and simplifying the pre- and post-processing pipeline in deployment.

Our Neural Machine Translation Systems for WAT 2019
Wei Yang | Jun Ogata

In this paper, we describe our Neural Machine Translation (NMT) systems for the WAT 2019 translation tasks we focus on. This year we participate in the scientific paper tasks and focus on the English-Japanese language pair. We use the Transformer model throughout this work to explore the power of the Transformer architecture, which relies on the self-attention mechanism. We use different NMT toolkits/libraries to train the Transformer model. For word segmentation, we use different subword segmentation strategies depending on the toolkit/library. We not only give the translation accuracy obtained with the absolute position encodings introduced in the Transformer model, but also report the improvements in translation accuracy obtained by replacing absolute position encodings with relative position representations. We also ensemble several independently trained Transformer models to further improve the translation accuracy.

Japanese-Russian TMU Neural Machine Translation System using Multilingual Model for WAT 2019
Aizhan Imankulova | Masahiro Kaneko | Mamoru Komachi

We introduce our system submitted to the News Commentary task (Japanese<->Russian) of the 6th Workshop on Asian Translation. The goal of this shared task is to study extremely low-resource situations for distant language pairs. It is known that using parallel corpora of different language pairs as training data is effective for multilingual neural machine translation models in extremely low-resource scenarios. Therefore, to improve the translation quality of the Japanese<->Russian language pair, our method leverages other in-domain Japanese-English and English-Russian parallel corpora as additional training data for our multilingual NMT model.

NLPRL at WAT2019: Transformer-based Tamil – English Indic Task Neural Machine Translation System
Amit Kumar | Anil Kumar Singh

This paper describes the Machine Translation system for the Tamil-English Indic Task organized at WAT 2019. We use a Transformer-based architecture for Neural Machine Translation.

Idiap NMT System for WAT 2019 Multimodal Translation Task
Shantipriya Parida | Ondřej Bojar | Petr Motlicek

This paper describes the Idiap submission to WAT 2019 for the English-Hindi Multi-Modal Translation Task. We have used the state-of-the-art Transformer model and utilized the IITB English-Hindi parallel corpus as an additional data source. Among the different tracks of the multi-modal task, we have participated in the “Text-Only” track for the evaluation and challenge test sets. Our submission tops in its track among the competitors in terms of both automatic and manual evaluation. Based on automatic scores, our text-only submission also outperforms systems that consider visual information in the “multi-modal translation” task.

WAT2019: English-Hindi Translation on Hindi Visual Genome Dataset
Loitongbam Sanayai Meetei | Thoudam Doren Singh | Sivaji Bandyopadhyay

Multimodal translation is the task of translating a source language into a target language with the help of a parallel text corpus paired with images that represent the contextual details of the text. In this paper, we carried out an extensive comparison to evaluate the benefits of using a multimodal approach for translating English text into a low-resource language, Hindi, as part of the WAT2019 shared task. We carried out the translation of English to Hindi in three separate tasks on both the evaluation and challenge datasets: first by using only the parallel text corpora, then through an image caption generation approach, and finally with the multimodal approach. Our experiments show a significant improvement in the results with the multimodal approach over the other approaches.

SYSTRAN @ WAT 2019: Russian-Japanese News Commentary task
Jitao Xu | TuAnh Nguyen | MinhQuang Pham | Josep Crego | Jean Senellart

This paper describes Systran’s submissions to the WAT 2019 Russian-Japanese News Commentary task, a challenging translation task due to the extremely low resources available and the distance between the two languages. We used the neural Transformer architecture trained on the provided resources and carried out synthetic data generation experiments aimed at alleviating the data scarcity problem. Results indicate the suitability of the data augmentation experiments, enabling our systems to rank first according to automatic evaluations.

UCSYNLP-Lab Machine Translation Systems for WAT 2019
Yimon ShweSin | Win Pa Pa | KhinMar Soe

This paper describes the UCSYNLP-Lab submission to WAT 2019 for the Myanmar-English translation tasks in both directions. We used neural machine translation systems with an attention model and utilized the UCSY corpus and the ALT corpus. In NMT with the attention model, we use word-level as well as syllable-level segmentation. In particular, we cleaned the UCSY corpus for WAT 2019. Therefore, the UCSY corpus for WAT 2019 is not identical to the one used in WAT 2018. Experiments show that the translation systems can produce substantial improvements.

Sentiment Aware Neural Machine Translation
Chenglei Si | Kui Wu | Ai Ti Aw | Min-Yen Kan

Sentiment ambiguous lexicons refer to words whose polarity depends strongly on context. As such, when the context is absent, their translations or their embedded sentence ends up (incorrectly) being dependent on the training data. While neural machine translation (NMT) has achieved great progress in recent years, most systems aim to produce one single correct translation for a given source sentence. We investigate the translation variation in two sentiment scenarios. We perform experiments to study the preservation of sentiment during translation with three different methods that we propose. We conducted tests with both sentiment and non-sentiment bearing contexts to examine the effectiveness of our methods. We show that NMT can generate both positive- and negative-valent translations of a source sentence, based on a given input sentiment label. Empirical evaluations show that our valence-sensitive embedding (VSE) method significantly outperforms a sequence-to-sequence (seq2seq) baseline, both in terms of BLEU score and ambiguous word translation accuracy in test, given non-sentiment bearing contexts.

Overcoming the Rare Word Problem for low-resource language pairs in Neural Machine Translation
Thi-Vinh Ngo | Thanh-Le Ha | Phuong-Thai Nguyen | Le-Minh Nguyen

Among the six challenges of neural machine translation (NMT) identified by Koehn and Knowles (2017), the rare-word problem is considered the most severe one, especially in the translation of low-resource languages. In this paper, we propose three solutions to address rare words in neural machine translation systems. First, we enhance the source context used to predict target words by directly connecting the source embeddings to the output of the attention component in NMT. Second, we propose an algorithm to learn the morphology of unknown English words in a supervised way in order to minimize the adverse effect of the rare-word problem. Finally, we exploit synonymy relations from WordNet to overcome the out-of-vocabulary (OOV) problem of NMT. We evaluate our approaches on two low-resource language pairs: English-Vietnamese and Japanese-Vietnamese. In our experiments, we have achieved significant improvements of up to roughly +1.0 BLEU points in both language pairs.

Neural Arabic Text Diacritization: State of the Art Results and a Novel Approach for Machine Translation
Ali Fadel | Ibraheem Tuffaha | Bara’ Al-Jawarneh | Mahmoud Al-Ayyoub

In this work, we present several deep learning models for the automatic diacritization of Arabic text. Our models are built using two main approaches, viz. Feed-Forward Neural Network (FFNN) and Recurrent Neural Network (RNN), with several enhancements such as 100-hot encoding, embeddings, Conditional Random Field (CRF) and Block-Normalized Gradient (BNG). The models are tested on the only freely available benchmark dataset and the results show that our models are either better or on par with other models, which require language-dependent post-processing steps, unlike ours. Moreover, we show that diacritics in Arabic can be used to enhance the models of NLP tasks such as Machine Translation (MT) by proposing the Translation over Diacritization (ToD) approach.