Learning Machine Translation

Cyril Goutte

Nicola Cancedda

Marc Dymetman

George Foster

The MIT Press

Cambridge, Massachusetts

London, England

Contents

Series Foreword ...................................................................................................... ix

Preface............................................................................................................................. xi

1 A Statistical Machine Translation Primer.............................................................. 1

Nicola Cancedda, Marc Dymetman, George Foster, and Cyril Goutte

1.1 Background.............................................................................................................. 1

1.2 Evaluation of Machine Translation..................................................................... 3

1.3 Word-Based MT...................................................................................................... 8

1.4 Language Models.................................................................................................... 11

1.5 Phrase-Based MT .......................................................................................... 18

1.6 Syntax-Based SMT................................................................................................. 26

1.7 Some Other Important Directions.............................................................. 30

1.8 Machine Learning for SMT ............................................................................... 32

1.9 Conclusion ............................................................................................................ 36

I Enabling Technologies ........................................................................................ 39

2 Mining Patents for Parallel Corpora............................................................................... 41

Masao Utiyama and Hitoshi Isahara

2.1 Introduction.......................................................................................................... 41

2.2 Related Work........................................................................................................... 42

2.3 Resources.................................................................................................................. 44

2.4 Alignment Procedure............................................................................................... 44

2.5 Statistics of the Patent Parallel Corpus.................................................. 48

2.6 MT Experiments................................................................................................. 51

2.7 Conclusion ............................................................................................................ 56

3 Automatic Construction of Multilingual Name Dictionaries.................................. 59

Bruno Pouliquen and Ralf Steinberger

3.1 Introduction and Motivation....................................................................... 59

3.2 Related Work........................................................................................................... 65

3.3 Multilingual Recognition of New Names............................................................ 68

3.4 Lookup of Known Names and Their Morphological Variants....................... 70

3.5 Evaluation of Person Name Recognition…………………………………… 72

3.6 Identification and Merging of Name Variants …………………………… 74

3.7 Conclusion and Future Work ……………………………………………… 78

4 Named Entity Transliteration and Discovery in Multilingual Corpora ………………… 79

Alexandre Klementiev and Dan Roth

4.1 Introduction …………………………………………………………79

4.2 Previous Work ………………………………………………………….82

4.3 Co-Ranking: An Algorithm for NE Discovery………………………………..83

4.4 Experimental Study …………………………………………………………86

4.5 Conclusions …………………………………………………………91

4.6 Future Work …………………………………………………………92

5 Combination of Statistical Word Alignments Based on Multiple Preprocessing Schemes ………………… 93

Jakob Elming, Nizar Habash, and Josep M. Crego

5.1 Introduction …………………………………………………………93

5.2 Related Work …………………………………………………………94

5.3 Arabic Preprocessing Schemes ……………………………………………….95

5.4 Preprocessing Schemes for Alignment ……………………………………….96

5.5 Alignment Combination …………………………………………………….. 97

5.6 Evaluation …………………………………………………………99

5.7 Postface: Machine Translation and Alignment Improvements …………..107

5.8 Conclusion ………………………………………………………..110

6 Linguistically Enriched Word-Sequence Kernels for Discriminative Language Modeling ………………… 111

Pierre Mahé and Nicola Cancedda

6.1 Motivations …………………………………………………………111

6.2 Linguistically Enriched Word-Sequence Kernels ……………………………113

6.3 Experimental Validation ……………………………………………………..119

6.4 Conclusion and Future Work …………………………………………………125

II Machine Translation ……………………………………………………….. 129

7 Toward Purely Discriminative Training for Tree-Structured Translation Models ………………… 131

Benjamin Wellington, Joseph Turian, and I. Dan Melamed

7.1 Introduction …………………………………………………………131

7.2 Related Work …………………………………………………………132

7.3 Learning Method …………………………………………………………134

7.4 Experiments …………………………………………………………140

7.5 Conclusion …………………………………………………………148

8 Reranking for Large-Scale Statistical Machine Translation ………………...151

Kenji Yamada and Ion Muslea

8.1 Introduction ………………………………………………………….151

8.2 Background ………………………………………………………….152

8.3 Related Work ………………………………………………………….153

8.4 Our Approach ………………………………………………………….154

8.5 Experiment 1: Reranking for the Chinese-to-English System ……………...156

8.6 Experiment 2: Reranking for the French-to-English System ………………..161

8.7 Discussion …………………………………………………………..165

8.8 Conclusion …………………………………………………………..165

9 Kernel-Based Machine Translation ………………………………………………169

Zhuoran Wang and John Shawe-Taylor

9.1 Introduction …………………………………………………………169

9.2 Regression Modeling for SMT ………………………………………………..171

9.3 Decoding …………………………………………………………...175

9.4 Experiments ……………………………………………………………177

9.5 Further Discussions ……………………………………………………….182

9.6 Conclusion ……………………………………………………………183

10 Statistical Machine Translation through Global Lexical Selection and Sentence Reconstruction ………………… 185

Srinivas Bangalore, Stephan Kanthak, and Patrick Haffner

10.1 Introduction ……………………………………………………… 185

10.2 SFST Training and Decoding ……………………………………………….187

10.3 Discriminant Models for Lexical Selection …………………………………193

10.4 Choosing the Classifier ……………………………………………………….195

10.5 Data and Experiments …………………………………………………..198

10.6 Discussion ………………………………………………………….201

10.7 Conclusions ………………………………………………………….202

11 Discriminative Phrase Selection for SMT ………………………………………205

Jesús Giménez and Lluís Màrquez

11.1 Introduction ………………………………………………………….205

11.2 Approaches to Dedicated Word Selection ………………………………….207

11.3 Discriminative Phrase Translation ………………………………………….209

11.4 Local Phrase Translation …………………………………………212

11.5 Exploiting Local DPT Models for the Global Task ………………218

11.6 Conclusions ………………………………………………………….234

12 Semisupervised Learning for Machine Translation …………………………….237

Nicola Ueffing, Gholamreza Haffari, and Anoop Sarkar

12.1 Introduction …………………………………………………………….237

12.2 Baseline MT System …………………………………………………………..238

12.3 The Framework …………………………………………………………….240

12.4 Experimental Results …………………………………………………………..245

12.5 Previous Work …………………………………………………………….253

12.6 Conclusion and Outlook ……………………………………………………….255

13 Learning to Combine Machine Translation Systems …………………………….257

Evgeny Matusov, Gregor Leusch, and Hermann Ney

13.1 Introduction …………………………………………………………….257

13.2 Word Alignment …………………………………………………………….260

13.3 Confusion Network Generation and Scoring ………………………………….266

13.4 Experiments …………………………………………………………….272

13.5 Conclusion …………………………………………………………….276

References …………………………………………………………………………...277

Contributors ……………………………………………………………………….307

Index ………………………………………………………………………………….313