WMT 2013

 

8th Workshop on Statistical Machine Translation

 

Proceedings of the Workshop

 

August 8-9, 2013

Sofia, Bulgaria

 

 

Table of Contents

Findings of the 2013 Workshop on Statistical Machine Translation

Ondřej Bojar, Christian Buck, Chris Callison-Burch, Christian Federmann, Barry Haddow, Philipp Koehn,

Christof Monz, Matt Post, Radu Soricut and Lucia Specia....................................................................................................................... 1

Results of the WMT13 Metrics Shared Task

Matouš Macháček and Ondřej Bojar.......................................................................................................................................................... 45

The Feasibility of HMEANT as a Human MT Evaluation Metric

Alexandra Birch, Barry Haddow, Ulrich Germann, Maria Nadejde, Christian Buck and Philipp Koehn ............................................................ 52

LIMSI @ WMT13

Alexander Allauzen, Nicolas Pécheux, Quoc Khanh Do, Marco Dinarelli, Thomas Lavergne, Aurélien Max,

Hai-Son Le and François Yvon ................................................................................................................................................................. 62

The CMU Machine Translation Systems at WMT 2013: Syntax, Synthetic Translation Options, and Pseudo-References

Waleed Ammar, Victor Chahuneau, Michael Denkowski, Greg Hanneman, Wang Ling, Austin Matthews,

Kenton Murray, Nicola Segall, Alon Lavie and Chris Dyer.....................................................................................................................70

Feature Decay Algorithms for Fast Deployment of Accurate Statistical Machine Translation Systems

Ergun Bicici.....................................................................................................................................................................................................78

CUni Multilingual Matrix in the WMT 2013 Shared Task

Karel Bílek and Daniel Zeman................................................................................................................................................................... 85

Chimera – Three Heads for English-to-Czech Translation

Ondřej Bojar, Rudolf Rosa and Aleš Tamchyna..................................................................................................................................... 92

Yandex School of Data Analysis Machine Translation Systems for WMT13

Alexey Borisov, Jacob Dlougach and Irina Galinskaya ............................................................ 99

The Karlsruhe Institute of Technology Translation Systems for the WMT 2013

Eunah Cho, Thanh-Le Ha, Mohammed Mediani, Jan Niehues, Teresa Herrmann, Isabel Slawik

and Alex Waibel ............................................................................................................................................................................................ 104

TÜBİTAK-BİLGEM German-English Machine Translation Systems for W13

Ilknur Durgar El-Kahlout and Coşkun Mermer ........................................................................................................................................ 109

Edinburgh’s Machine Translation Systems for European Language Pairs

Nadir Durrani, Barry Haddow, Kenneth Heafield and Philipp Koehn................................................................................................... 114

Munich-Edinburgh-Stuttgart Submissions of OSM Systems at WMT13

Nadir Durrani, Alexander Fraser, Helmut Schmid, Hassan Sajjad and Richárd Farkas ............................................................ 122

Towards Efficient Large-Scale Feature-Rich Statistical Machine Translation

Vladimir Eidelman, Ke Wu, Ferhan Ture, Philip Resnik and Jimmy Lin................................................................................................. 128

The TALP-UPC Phrase-Based Translation Systems for WMT13: System Combination with Morphology Generation, Domain Adaptation and Corpus Filtering

Lluís Formiga, Marta R. Costa-jussà, José B. Mariño, José A. R. Fonollosa, Alberto Barrón-Cedeño

and Lluis Marquez ........................................................................................................................................................................................... 134

 

PhraseFix: Statistical Post-Editing of TectoMT

Petra Galuščáková, Martin Popel and Ondřej Bojar....................................................................................................................................141

Feature-Rich Phrase-based Translation: Stanford University’s Submission to the WMT 2013 Translation Task

Spence Green, Daniel Cer, Kevin Reschke, Rob Voigt, John Bauer, Sida Wang, Natalia Silveira,

Julia Neidert and Christopher D. Manning .................................................................................................................................................. 148

 

Factored Machine Translation Systems for Russian-English

Stéphane Huet, Elena Manishina and Fabrice Lefèvre ........................................................................................................................... 154

Omnifluent English-to-French and Russian-to-English Systems for the 2013 Workshop on Statistical Machine Translation

Evgeny Matusov and Gregor Leusch.............................................................................................................................................................158

Pre-Reordering for Machine Translation Using Transition-Based Walks on Dependency Parse Trees

Antonio Valerio Miceli Barone and Giuseppe Attardi.................................................................................................................................... 164

Edinburgh’s Syntax-Based Machine Translation Systems

Maria Nadejde, Philip Williams and Philipp Koehn ....................................................................................................................................... 170

Shallow Semantically-Informed PBSMT and HPBSMT

Tsuyoshi Okita, Qun Liu and Josef van Genabith .......................................................................................................................................... 177

Joint WMT 2013 Submission of the QUAERO Project

Stephan Peitz, Saab Mansour, Matthias Huck, Markus Freitag, Hermann Ney, Eunah Cho, Teresa

Herrmann, Mohammed Mediani, Jan Niehues, Alex Waibel, Alexander Allauzen, Quoc Khanh Do,

Bianka Buschbeck and Tonio Wandmacher ..................................................................................................................................................... 185

The RWTH Aachen Machine Translation System for WMT 2013

Stephan Peitz, Saab Mansour, Jan-Thorsten Peter, Christoph Schmidt, Joern Wuebker, Matthias

Huck, Markus Freitag and Hermann Ney ........................................................................................................................................................... 193

The University of Cambridge Russian-English System at WMT13

Juan Pino, Aurelien Waite, Tong Xiao, Adrià de Gispert, Federico Flego and William Byrne ............................................................ 200

Joshua 5.0: Sparser, Better, Faster, Server

Matt Post, Juri Ganitkevitch, Luke Orland, Jonathan Weese, Yuan Cao and Chris Callison-Burch ............................................................ 206

 

The CNGL-DCU-Prompsit Translation Systems for WMT13

Raphael Rubino, Antonio Toral, Santiago Cortés Vaíllo, Jun Xie, Xiaofeng Wu, Stephen Doherty

and Qun Liu ............................................................ 213

QCRI-MES Submission at WMT13: Using Transliteration Mining to Improve Statistical Machine Translation

Hassan Sajjad, Svetlana Smekalova, Nadir Durrani, Alexander Fraser and Helmut Schmid ............................................................ 219

Tunable Distortion Limits and Corpus Cleaning for SMT

Sara Stymne, Christian Hardmeier, Jörg Tiedemann and Joakim Nivre ............................................................ 225

Munich-Edinburgh-Stuttgart Submissions at WMT13: Morphological and Syntactic Processing for SMT

Marion Weller, Max Kisselew, Svetlana Smekalova, Alexander Fraser, Helmut Schmid, Nadir Durrani,

Hassan Sajjad and Richárd Farkas......................................................................................................................................................................... 232

 

Coping with the Subjectivity of Human Judgements in MT Quality Estimation

Marco Turchi, Matteo Negri and Marcello Federico.............................................................................................................................................. 240

Online Polylingual Topic Models for Fast Document Translation Detection

Kriste Krstovski and David A. Smith ....................................................................................................................................................................... 252

Combining Bilingual and Comparable Corpora for Low Resource Machine Translation

Ann Irvine and Chris Callison-Burch..................................................................................................................................................................... 262

Generating English Determiners in Phrase-Based Translation with Synthetic Translation Options

Yulia Tsvetkov, Chris Dyer, Lori Levin and Archna Bhatia................................................................................................................................. 271

Dramatically Reducing Training Data Size Through Vocabulary Saturation

William Lewis and Sauleh Eetemadi....................................................................................................................................................................... 281

Multi-Task Learning for Improved Discriminative Training in SMT

Patrick Simianer and Stefan Riezler ....................................................................................................................................................................... 292

Online Learning Approaches in Computer Assisted Translation

Prashant Mathur, Mauro Cettolo and Marcello Federico ............................................................ 301

Length-Incremental Phrase Training for SMT

Joern Wuebker and Hermann Ney........................................................................................................................................................................... 309

Positive Diversity Tuning for Machine Translation System Combination

Daniel Cer, Christopher D. Manning and Dan Jurafsky ............................................................ 320

Selecting Feature Sets for Comparative and Time-Oriented Quality Estimation of Machine Translation Output

Eleftherios Avramidis and Maja Popovic.................................................................................................................................................................. 329

SHEF-Lite: When Less is More for Translation Quality Estimation

Daniel Beck, Kashif Shah, Trevor Cohn and Lucia Specia ............................................................ 337

Referential Translation Machines for Quality Estimation

Ergun Bicici .................................................................................................................................................................................................................... 343

FBK-UEdin Participation to the WMT13 Quality Estimation Shared Task

José Guilherme Camargo de Souza, Christian Buck, Marco Turchi and Matteo Negri ............................................................ 352

The TALP-UPC Approach to System Selection: Asiya Features and Pairwise Classification Using Random Forests

Lluís Formiga, Meritxell Gonzàlez, Alberto Barrón-Cedeño, José A. R. Fonollosa and Lluis Marquez ............................................................ 359

Quality Estimation for Machine Translation Using the Joint Method of Evaluation Criteria and Statistical Modeling

Aaron Li-Feng Han, Yi Lu, Derek F. Wong, Lidia S. Chao, Liangye He and Junwen Xing ............................................................ 365

MT Quality Estimation: The CMU System for WMT’13

Silja Hildebrand and Stephan Vogel ........................................................................................................................................................................... 373

LORIA System for the WMT13 Quality Estimation Shared Task

David Langlois and Kamel Smaili................................................................................................................................................................................. 380

 

LIG System for WMT13 QE Task: Investigating the Usefulness of Features in Word Confidence Estimation for MT

Ngoc Quang Luong, Benjamin Lecouteux and Laurent Besacier.......................................................................................................................... 386

DCU-Symantec at the WMT 2013 Quality Estimation Shared Task

Raphael Rubino, Joachim Wagner, Jennifer Foster, Johann Roturier, Rasoul Samad Zadeh Kaljahi

and Fred Hollowood........................................................................................................................................................................................................ 392

LIMSI Submission for the WMT’13 Quality Estimation Task: an Experiment with N-Gram Posteriors

Anil Kumar Singh, Guillaume Wisniewski and François Yvon .............................................................................................................................. 398

Ranking Translations using Error Analysis and Quality Estimation

Mark Fishel........................................................................................................................................................................................................................ 405

Are ACT’s Scores Increasing with Better Translation Quality?

Najeh Hajlaoui.................................................................................................................................................................................................................. 408

A Description of Tunable Machine Translation Evaluation Systems in WMT13 Metrics Task

Aaron Li-Feng Han, Derek F. Wong, Lidia S. Chao, Yi Lu, Liangye He, Yiming Wang and Jiaji Zhou ............................................................ 414

MEANT at WMT 2013: A Tunable, Accurate yet Inexpensive Semantic Frame Based MT Evaluation Metric

Chi-kiu Lo and Dekai Wu ................................................................................................................................................................................................ 422

An Approach Using Style Classification Features for Quality Estimation

Erwan Moreau and Raphael Rubino ............................................................................................................................................................................. 429

DCU Participation in WMT2013 Metrics Task

Xiaofeng Wu, Hui Yu and Qun Liu................................................................................................................................................................................. 435

Efficient Solutions for Word Reordering in German-English Phrase-Based Statistical Machine Translation

Arianna Bisazza and Marcello Federico......................................................................................................................................................................... 440

A Phrase Orientation Model for Hierarchical Machine Translation

Matthias Huck, Joern Wuebker, Felix Rietig and Hermann Ney.................................................................................................................................452

A Dependency-Constrained Hierarchical Model with Moses

Yvette Graham...................................................................................................................................................................................................................... 464

Investigations in Exact Inference for Hierarchical Translation

Wilker Aziz, Marc Dymetman and Sriram Venkatapathy............................................................................................................................................... 472

Evaluating (and Improving) Sentence Alignment under Noisy Conditions

Omar Zaidan and Vishal Chowdhary .............................................................................................................................................................................. 484

Multi-Rate HMMs for Word Alignment

Elif Eyigöz, Daniel Gildea and Kemal Oflazer................................................................................................................................................................. 494

Hidden Markov Tree Model for Word Alignment

Shuhei Kondo, Kevin Duh and Yuji Matsumoto............................................................................................................................................................. 503

An MT Error-Driven Discriminative Word Lexicon using Sentence Structure Features

Jan Niehues and Alex Waibel............................................................................................................................................................................................. 512