<?xml version="1.0" encoding="UTF-8" ?>
<volume id="W17">
  <paper id="4800">
    <title>Proceedings of the Third Workshop on Discourse in Machine Translation</title>
    <editor>Bonnie Webber</editor>
    <editor>Andrei Popescu-Belis</editor>
    <editor>Jörg Tiedemann</editor>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <url>http://aclweb.org/anthology/W17-48</url>
    <bibtype>book</bibtype>
    <bibkey>DiscoMT:2017</bibkey>
  </paper>

  <paper id="4801">
    <title>Findings of the 2017 DiscoMT Shared Task on Cross-lingual Pronoun Prediction</title>
    <author><first>Sharid</first><last>Lo&#225;iciga</last></author>
    <author><first>Sara</first><last>Stymne</last></author>
    <author><first>Preslav</first><last>Nakov</last></author>
    <author><first>Christian</first><last>Hardmeier</last></author>
    <author><first>J&#246;rg</first><last>Tiedemann</last></author>
    <author><first>Mauro</first><last>Cettolo</last></author>
    <author><first>Yannick</first><last>Versley</last></author>
    <booktitle>Proceedings of the Third Workshop on Discourse in Machine Translation</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>1&#8211;16</pages>
    <url>http://aclweb.org/anthology/W17-4801</url>
    <attachment type="attachment">W17-4801.Attachment.zip</attachment>
    <abstract>We describe the design, the setup, and the evaluation results of the DiscoMT
	2017 shared task on cross-lingual pronoun prediction. The task asked
	participants to predict a target-language pronoun given a source-language
	pronoun in the context of a sentence. We further provided a lemmatized
	target-language human-authored translation of the source sentence, and
	automatic word alignments between the source sentence words and the
	target-language lemmata. The aim of the task was to predict, for each
	target-language pronoun placeholder, the word that should replace it from a
	small, closed set of classes, using any type of information that can be
	extracted from the entire document. We offered four subtasks, each for a
	different language pair and translation direction: English-to-French,
	English-to-German, German-to-English, and Spanish-to-English. Five teams
	participated in the shared task, making submissions for all language pairs. The
	evaluation results show that most participating teams outperformed two strong
	n-gram-based language model-based baseline systems by a sizable margin.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>loaiciga-EtAl:2017:DiscoMT</bibkey>
  </paper>

  <paper id="4802">
    <title>Validation of an Automatic Metric for the Accuracy of Pronoun Translation (APT)</title>
    <author><first>Lesly</first><last>Miculicich Werlen</last></author>
    <author><first>Andrei</first><last>Popescu-Belis</last></author>
    <booktitle>Proceedings of the Third Workshop on Discourse in Machine Translation</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>17&#8211;25</pages>
    <url>http://aclweb.org/anthology/W17-4802</url>
    <abstract>In this paper, we define and assess a reference-based metric to evaluate the
	accuracy of pronoun translation (APT). The metric automatically aligns a
	candidate and a reference translation using GIZA++ augmented with specific
	heuristics, and then counts the number of identical or different pronouns, with
	provision for legitimate variations and omitted pronouns.  All counts are then
	combined into one score.  The metric is applied to the results of seven systems
	(including the baseline) that participated in the DiscoMT 2015 shared task on
	pronoun translation from English to French. The APT metric reaches around
	0.993-0.999 Pearson correlation with human judges (depending on the parameters
	of APT), while other automatic metrics such as BLEU, METEOR, or those specific
	to pronouns used at DiscoMT 2015 reach only 0.972-0.986 Pearson correlation.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>miculicichwerlen-popescubelis:2017:DiscoMT</bibkey>
  </paper>

  <paper id="4803">
    <title>Using a Graph-based Coherence Model in Document-Level Machine Translation</title>
    <author><first>Leo</first><last>Born</last></author>
    <author><first>Mohsen</first><last>Mesgar</last></author>
    <author><first>Michael</first><last>Strube</last></author>
    <booktitle>Proceedings of the Third Workshop on Discourse in Machine Translation</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>26&#8211;35</pages>
    <url>http://aclweb.org/anthology/W17-4803</url>
    <abstract>Although coherence is an important aspect of any text generation system, it has
	received little attention in the context of machine translation (MT) so far. 
	We hypothesize that the quality of document-level translation can be improved
	if MT models take into account the semantic relations among sentences during
	translation. We integrate the graph-based coherence model proposed by Mesgar
	and Strube (2016) with Docent (Hardmeier et al., 2012; Hardmeier, 2014), a
	document-level machine translation system. The application of this graph-based
	coherence modeling approach is novel in the context of machine translation. We
	evaluate the coherence model and its effects on the quality of the machine
	translation. The result of our experiments shows that our coherence model
	slightly improves the quality of translation in terms of the average Meteor
	score.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>born-mesgar-strube:2017:DiscoMT</bibkey>
  </paper>

  <paper id="4804">
    <title>Treatment of Markup in Statistical Machine Translation</title>
    <author><first>Mathias</first><last>M&#252;ller</last></author>
    <booktitle>Proceedings of the Third Workshop on Discourse in Machine Translation</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>36&#8211;46</pages>
    <url>http://aclweb.org/anthology/W17-4804</url>
    <abstract>We present work on handling XML markup in Statistical Machine Translation
	(SMT). The methods we propose can be used to effectively preserve markup (for
	instance inline formatting or structure) and to place markup correctly in a
	machine-translated segment. We evaluate our approaches with parallel data that
	naturally contains markup or where markup was inserted to create synthetic
	examples. In our experiments, hybrid reinsertion has proven the most accurate
	method to handle markup, while alignment masking and alignment reinsertion
	should be regarded as viable alternatives. We provide implementations of all
	the methods described and they are freely available as an open-source
	framework.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>muller:2017:DiscoMT</bibkey>
  </paper>

  <paper id="4805">
    <title>A BiLSTM-based System for Cross-lingual Pronoun Prediction</title>
    <author><first>Sara</first><last>Stymne</last></author>
    <author><first>Sharid</first><last>Lo&#225;iciga</last></author>
    <author><first>Fabienne</first><last>Cap</last></author>
    <booktitle>Proceedings of the Third Workshop on Discourse in Machine Translation</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>47&#8211;53</pages>
    <url>http://aclweb.org/anthology/W17-4805</url>
    <abstract>We describe the Uppsala system for the 2017 DiscoMT shared task on
	cross-lingual pronoun prediction. The system is based on a lower layer of
	BiLSTMs reading the source and target sentences respectively. Classification is
	based on the BiLSTM representation of the source and target positions for the
	pronouns. In addition we enrich our system with dependency representations from
	an external parser and character representations of the source sentence. We
	show that these additions perform well for German and Spanish as source
	languages. Our system is competitive and is in first or second place for all
	language pairs.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>stymne-loaiciga-cap:2017:DiscoMT</bibkey>
  </paper>

  <paper id="4806">
    <title>Neural Machine Translation for Cross-Lingual Pronoun Prediction</title>
    <author><first>S&#233;bastien</first><last>Jean</last></author>
    <author><first>Stanislas</first><last>Lauly</last></author>
    <author><first>Orhan</first><last>Firat</last></author>
    <author><first>Kyunghyun</first><last>Cho</last></author>
    <booktitle>Proceedings of the Third Workshop on Discourse in Machine Translation</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>54&#8211;57</pages>
    <url>http://aclweb.org/anthology/W17-4806</url>
    <abstract>In this paper we present our systems for the DiscoMT 2017 cross-lingual pronoun
	prediction shared task. For all four language pairs, we trained a standard
	attention-based neural machine translation system as well as three variants
	that incorporate information from the preceding source sentence. We show that
	our systems, which are not specifically designed for pronoun prediction and may
	be used to generate complete sentence translations, generally achieve
	competitive results on this task.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>jean-EtAl:2017:DiscoMT</bibkey>
  </paper>

  <paper id="4807">
    <title>Predicting Pronouns with a Convolutional Network and an N-gram Model</title>
    <author><first>Christian</first><last>Hardmeier</last></author>
    <booktitle>Proceedings of the Third Workshop on Discourse in Machine Translation</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>58&#8211;62</pages>
    <url>http://aclweb.org/anthology/W17-4807</url>
    <abstract>This paper describes the UU-Hardmeier system submitted to the DiscoMT
	2017 shared task on cross-lingual pronoun prediction. The system is an ensemble
	of convolutional neural networks combined with a source-aware n-gram language
	model.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>hardmeier:2017:DiscoMT</bibkey>
  </paper>

  <paper id="4808">
    <title>Cross-Lingual Pronoun Prediction with Deep Recurrent Neural Networks v2.0</title>
    <author><first>Juhani</first><last>Luotolahti</last></author>
    <author><first>Jenna</first><last>Kanerva</last></author>
    <author><first>Filip</first><last>Ginter</last></author>
    <booktitle>Proceedings of the Third Workshop on Discourse in Machine Translation</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>63&#8211;66</pages>
    <url>http://aclweb.org/anthology/W17-4808</url>
    <abstract>In this paper we present our system for the DiscoMT 2017 Shared Task on
	Cross-lingual Pronoun Prediction. Our entry builds on our success from last
	year, when our system based on deep recurrent neural networks outperformed
	all other systems by a clear margin. This year we investigate whether
	different pre-trained word embeddings can be used to improve the neural
	systems, and whether the recently published Gated Convolutions outperform
	the Gated Recurrent Units used last year.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>luotolahti-kanerva-ginter:2017:DiscoMT</bibkey>
  </paper>

  <paper id="4809">
    <title>Combining the output of two coreference resolution systems for two source languages to improve annotation projection</title>
    <author><first>Yulia</first><last>Grishina</last></author>
    <booktitle>Proceedings of the Third Workshop on Discourse in Machine Translation</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>67&#8211;72</pages>
    <url>http://aclweb.org/anthology/W17-4809</url>
    <abstract>Although parallel coreference corpora can to a high degree support the
	development of SMT systems, there are no large-scale parallel datasets
	available due to the complexity of the annotation task and the variability in
	annotation schemes. In this study, we exploit an annotation projection method
	to combine the output of two coreference resolution systems for two different
	source languages (English, German) in order to create an annotated corpus for
	a third language (Russian). We show that our technique is superior to
	projecting annotations from a single source language, and we provide an
	in-depth analysis of the projected annotations in order to assess the
	prospects of our approach.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>grishina:2017:DiscoMT</bibkey>
  </paper>

  <paper id="4810">
    <title>Discovery of Discourse-Related Language Contrasts through Alignment Discrepancies in English-German Translation</title>
    <author><first>Ekaterina</first><last>Lapshinova-Koltunski</last></author>
    <author><first>Christian</first><last>Hardmeier</last></author>
    <booktitle>Proceedings of the Third Workshop on Discourse in Machine Translation</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>73&#8211;81</pages>
    <url>http://aclweb.org/anthology/W17-4810</url>
    <abstract>In this paper, we analyse alignment discrepancies for discourse structures in
	English-German parallel data &#8211; sentence pairs in which discourse structures
	in target or source texts have no alignment in the corresponding parallel
	sentences. The discourse-related structures are designed in the form of
	linguistic patterns based on the information delivered by automatic
	part-of-speech and dependency annotation. In addition to alignment errors
	(existing structures left unaligned), these alignment discrepancies can be
	caused by language contrasts or by the phenomena of explicitation and
	implicitation in the translation process. We propose a new approach, including
	a new type of resource for corpus-based language contrast analysis, and apply
	it to study and classify the contrasts found in our English-German parallel
	corpus. As unaligned discourse structures may also result in the loss of
	discourse information in the MT training data, we hope to deliver information
	in support of discourse-aware machine translation (MT).</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>lapshinovakoltunski-hardmeier:2017:DiscoMT</bibkey>
  </paper>

  <paper id="4811">
    <title>Neural Machine Translation with Extended Context</title>
    <author><first>J&#246;rg</first><last>Tiedemann</last></author>
    <author><first>Yves</first><last>Scherrer</last></author>
    <booktitle>Proceedings of the Third Workshop on Discourse in Machine Translation</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>82&#8211;92</pages>
    <url>http://aclweb.org/anthology/W17-4811</url>
    <abstract>We investigate the use of extended context in attention-based neural machine
	translation. We base our experiments on translated movie subtitles and discuss
	the effect of increasing the segments beyond single translation units. We study
	the use of extended source language context as well as bilingual context
	extensions. The models learn to distinguish between information from different
	segments and are surprisingly robust with respect to translation quality. In
	this pilot study, we observe interesting cross-sentential attention patterns
	that improve textual coherence in translation at least in some selected cases.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>tiedemann-scherrer:2017:DiscoMT</bibkey>
  </paper>

  <paper id="4812">
    <title>Translating Implicit Discourse Connectives Based on Cross-lingual Annotation and Alignment</title>
    <author><first>Hongzheng</first><last>Li</last></author>
    <author><first>Philippe</first><last>Langlais</last></author>
    <author><first>Yaohong</first><last>Jin</last></author>
    <booktitle>Proceedings of the Third Workshop on Discourse in Machine Translation</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>93&#8211;98</pages>
    <url>http://aclweb.org/anthology/W17-4812</url>
    <abstract>Implicit discourse connectives and relations are widely distributed in Chinese
	texts; when translating into English, such connectives are usually translated
	explicitly. Towards Chinese-English MT, in this paper we describe
	cross-lingual annotation and alignment of discourse connectives in a parallel
	corpus, describing related surveys and findings. We then conduct evaluation
	experiments to test the translation of implicit connectives and whether
	representing implicit connectives explicitly in the source language can
	improve the final translation performance significantly. Preliminary results
	show little improvement from merely inserting explicit connectives for
	implicit relations.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>li-langlais-jin:2017:DiscoMT</bibkey>
  </paper>

  <paper id="4813">
    <title>Lexical Chains meet Word Embeddings in Document-level Statistical Machine Translation</title>
    <author><first>Laura</first><last>Mascarell</last></author>
    <booktitle>Proceedings of the Third Workshop on Discourse in Machine Translation</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>99&#8211;109</pages>
    <url>http://aclweb.org/anthology/W17-4813</url>
    <abstract>The phrase-based Statistical Machine Translation (SMT) approach deals with
	sentences in isolation, making it difficult to consider discourse context in
	translation. This poses a challenge for ambiguous words that need discourse
	knowledge to be correctly translated. We propose a method that benefits from
	the semantic similarity in lexical chains to improve SMT output by integrating
	it in a document-level decoder. We focus on word embeddings to deal with the
	lexical chains, contrary to the traditional approach that uses lexical
	resources. Experimental results on German-to-English show that our method
	produces correct translations in up to 88% of the changes, improving the
	translation in 36%-48% of them over the baseline.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>mascarell:2017:DiscoMT</bibkey>
  </paper>

  <paper id="4814">
    <title>On Integrating Discourse in Machine Translation</title>
    <author><first>Karin</first><last>Sim Smith</last></author>
    <booktitle>Proceedings of the Third Workshop on Discourse in Machine Translation</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>110&#8211;121</pages>
    <url>http://aclweb.org/anthology/W17-4814</url>
    <abstract>As the quality of Machine Translation (MT) improves, research on improving
	discourse in automatic translations becomes more viable. This has resulted in
	an increase in the amount of work on discourse in MT. However many of the
	existing models and metrics have yet to integrate these insights. 
	Part of this is due to the evaluation methodology, based as it is largely on
	matching to a single reference. At a time when MT is increasingly being
	used in a pipeline for other tasks, the semantic element of the translation
	process needs to be properly integrated into the task. Moreover, in order to
	take MT to another level, it will need to judge output not based on a single
	reference translation, but based on notions of fluency and of adequacy &#8211;
	ideally with reference to the source text.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>simsmith:2017:DiscoMT</bibkey>
  </paper>

</volume>

