Machine Translation Summit XIII
Proceedings of the 13th Machine Translation Summit
Organized by: Asia-Pacific Association for Machine Translation
Supported by: Chinese Information Processing Society of
September 19-23, 2011,
Table of Contents
Message from the President of International Association for Machine Translation:
Professor Hitoshi Isahara .......... ii
Message from the Program Committee Chair:
Dr. Hiromi Nakaiwa .......... iii
Tutorials .......... 12
Keynote: Professor Zhendong Dong .......... 18
Invited Talk 1: MT everywhere: Next Steps
Dr. Mike Dillinger .......... 20
Invited Talk 2: Strategic MT Research in
Professor Hans Uszkoreit .......... 21
Special Session on Patent Translation
Introductory Talk: Challenges of Patent MT -- Term and Structure Translation
Professor Jun'ichi Tsujii .......... 22
Invited Talk: MT for Patent Search at KIPO
Mr. YooChan Choi .......... 23
Invited Talk: COPPA, CLIR and TAPTA: three tools to assist in overcoming the Patent language barrier at WIPO
Mr. Bruno Pouliquen .......... 24
A1 Research Papers – Training (1)
A1-1 Methods for Smoothing the Optimizer Instability in SMT
Mauro Cettolo, Nicola Bertoldi and Marcello Federico .......... 32
Aaron Phillips and Ralf Brown .......... 40
A1-3 Maximum Rank Correlation Training for Statistical Machine Translation
Daqi Zheng, Yifan He, Yang Liu and Qun Liu .......... 48
B1 Research Papers – Pre-processing for MT
B1-1 POS Tagging of English Particles for Machine Translation
Jianjun Ma, Degen Huang, Haixia Liu and Wenfeng Sheng .......... 57
B1-2 Multi-stage Chinese Dependency Parsing Based on Dependency Direction
Wenjing Lang, Qiaoli Zhou, Guiping Zhang and Dongfeng Cai .......... 64
B1-3 Statistic Machine Translation Boosted with Spurious Word Deletion
Shujie Liu, Chi-Ho Li and Ming Zhou .......... 72
C1 Research Papers – Speech Translation
C1-1 Phonetic Representation-Based Speech Translation
Jie Jiang, Zeeshan Ahmed, Julie Carson-Berndsen, Peter Cahill and
C1-2 Unsupervised Vocabulary Selection for Domain-Independent Simultaneous Lecture
Paul Maergner,
C1-3 Context-aware Language Modeling for Conversational Speech Translation
Avneesh Saluja,
A2 Research Papers – Training (2)
A2-1 Incremental Training and Intentional Over-fitting of Word Alignment
Qin Gao, Will Lewis, Chris Quirk and Mei-Yuh Hwang .......... 106
A2-2 Alignment Inference and Bayesian Adaptation for Machine Translation
Kevin Duh, Katsuhito Sudoh, Tomoharu Iwata and Hajime Tsukada .......... 114
A2-3 Multi-Strategy Approaches to Active Learning for Statistical Machine Translation
Vamshi Ambati, Stephan Vogel and Jaime Carbonell .......... 122
B2 Research Papers – Technologies Supporting MT
B2-1 Document-level Consistency Verification in Machine Translation
Tong Xiao, Jingbo Zhu, Shujie Yao and Hao Zhang .......... 131
B2-2 Function Word Generation in Statistical Machine Translation Systems
Lei Cui, Dongdong Zhang, Mu Li and Ming Zhou .......... 139
B2-3 Multimodal Building of Monolingual Dictionaries for Machine Translation by Non-Expert Users
Miquel Esplà-Gomis, Víctor M. Sánchez-Cartagena and Juan Antonio Pérez-Ortiz .......... 147
C2 Research Papers – Computer Assisted Translation
Hirokazu Suzuki .......... 156
C2-2 Qualitative Analysis of Post-Editing for High Quality Machine Translation
Frédéric Blain, Jean Senellart, Holger Schwenk, Mirko Plitt and Johann Roturier .......... 164
C2-3 Using machine translation in computer-aided translation to suggest the target-side
Miquel Esplà-Gomis, Felipe Sánchez-Martínez and Mikel L. Forcada .......... 172
A3 Research Papers – Model (1)
A3-1 A Unified SMT Framework Combining MIRA and MERT
Shujie Liu, Chi-Ho Li and Ming Zhou .......... 181
A3-2 Improving Phrase Extraction via MBR Phrase Scoring and Pruning
A3-3 Phrase Segmentation Model using Collocation and Translational Entropy
Hyoung-Gyu Lee, Joo-Young Lee, Min-Jeong Kim, Hae-Chang Rim, Joong-Hwi Shin and Young-Sook Hwang .......... 198
B3 Research Papers – MT Based on Linguistic Knowledge
B3-1 Singular or Plural? Exploiting Parallel Corpora for Chinese Number Prediction
Elizabeth Baran and Nianwen Xue .......... 207
B3-2 Handling Multiword Expressions in Phrase-Based Statistical Machine Translation
Santanu Pal, Tanmoy Chakraborty and Sivaji Bandyopadhyay .......... 215
B3-3 Automatic Error Analysis for Morphologically Rich Languages
Ahmed El Kholy and Nizar Habash .......... 225
C3 User’s Studies (1)
C3-1 MT use within the enterprise: Encouraging adoption via a unified MT API
Raymond Flournoy .......... 234
C3-2 Deploying MT into a Localisation Workflow: Pains and Gains
Yanli Sun, Juan Liu and Yi Li .......... 239
C3-3 Evaluation of MT Systems to Translate User Generated Content
Johann Roturier and Anthony Bensadoun .......... 244
A4 Research Papers – Model (2)
A4-1 A Unified and Discriminative Soft Syntactic Constraint Model for Hierarchical
Lemao Liu, Tiejun Zhao, Chao Wang and Hailong Cao .......... 253
A4-2 Simple but Effective Approaches to Improving Tree-to-tree Model
Feifei Zhai, Jiajun Zhang, Yu Zhou and Chengqing Zong .......... 261
A4-3 Unpacking and Transforming Feature Functions: New Ways to Smooth Phrase Tables
Boxing Chen, Roland Kuhn, George Foster and Howard Johnson .......... 269
B4 Research Papers – Domain Adaptation
B4-1 Identification and Translation of Significant Patterns for Cross-Domain SMT Applications
Han-Bin Chen, Hen-Hsen Huang, Jengwei Tjiu, Ching-Ting Tan and Hsin-Hsi Chen .......... 277
B4-2 Domain Adaptation in Statistical Machine Translation of User-Forum Data using Component Level Mixture Modelling
Pratyush Banerjee, Sudip Kumar Naskar, Johann Roturier,
Genabith .......... 285
B4-3 Bagging-based System Combination for Domain Adaption
Linfeng Song, Haitao Mi, Yajuan Lü and Qun Liu .......... 293
C4 Research Papers – Multi-path Translation
C4-1 Extracting Pre-ordering Rules from Chunk-based Dependency Trees for Japanese-to-English Translation
Xianchao Wu, Katsuhito Sudoh, Kevin Duh, Hajime Tsukada and Masaaki Nagata .......... 300
C4-2 Statistical Post-Editing for a Statistical MT System
Hanna Bechara, Yanjun Ma and Josef van Genabith .......... 308
C4-3 Post-ordering in Statistical Machine Translation
Katsuhito Sudoh, Xianchao Wu, Kevin Duh, Hajime Tsukada and Masaaki Nagata .......... 316
P1A Research Papers
P1A-1 Searching Translation Memories for Paraphrases
Masao Utiyama, Graham Neubig, Takashi Onishi and Eiichiro Sumita .......... 325
P1A-2 Are numbers good enough for you? - A linguistically meaningful MT evaluation method
Takako Aikawa and Spencer Rarrick .......... 332
P1A-3 Marker-based Chunking for Analogy-based Translation of Chunks
P1A-4 A Comparison of Unsupervised Bilingual Term Extraction Methods Using Phrase-Tables
Masamichi Ideue, Kazuhide Yamamoto, Masao Utiyama and Eiichiro Sumita .......... 346
Jeff Ma, Spyros Matsoukas and Richard Schwartz .......... 352
P1A-6 Multi-granularity Word Alignment and Decoding for Agglutinative Language Translation
Zhiyang Wang, Yajuan Lü and Qun Liu .......... 360
P1C System Presentations
P1C-1 Word Alignment Using GIZA++ on Windows
Liang Tian, Fai Wong and Sam Chao .......... 369
P1C-2 ENGtube: an Integrated Subtitle Environment for ESL
Chi-Ho Li, Shujie Liu, Chenguang Wang and Ming Zhou .......... 373
P1C-3 Broadcast news speech-to-text translation experiments
Sylvain Raybaud, David Langlois and Kamel Smaïli .......... 378
A5 Research Papers – Model (3)
A5-1 Improving the Hierarchical Phrase-Based Translation Model
Xiaodong Shi, Xiang Zhu and Yidong Chen .......... 383
A5-2 Lexical-based Reordering Model for Hierarchical Phrase-based Machine Translation
Zhongguang Zheng,
A5-3 Effective Use of Discontinuous Phrases for Hierarchical Phrase-based Translation
Wei Wei and Bo Xu .......... 397
B5 Research Papers – Corpora
B5-1 Generating Virtual Parallel Corpus: A Compatibility Centric Method
Jia Xu and Weiwei Sun .......... 406
B5-2 Parallel Corpus Refinement as an Outlier Detection Algorithm
Kaveh Taghipour, Shahram Khadivi and Jia Xu .......... 414
B5-3 MT Detection in Web-Scraped Parallel Corpora
Spencer Rarrick, Chris Quirk and Will Lewis .......... 422
C5 Research Papers – Grammatical Theory for MT
C5-1 On the Expressivity of Linear Transductions
Markus Saers, Dekai Wu and Chris Quirk .......... 431
C5-2 Handheld Machine Translation System Based on Constraint Synchronous Grammar
Fai Wong, Francisco Oliveira, Sam Chao and Chi-Wai Tang .......... 439
P2A Research Papers
P2A-1 A Comparison Study of Parsers for Patent Machine Translation
Isao Goto, Masao Utiyama, Takashi Onishi and Eiichiro Sumita .......... 448
P2A-2 Rich Linguistic Features for Translation Memory-Inspired Consistent Translation
Yifan He, Yanjun Ma,
P2A-3 Japanese-Chinese Phrase Alignment Using Common Chinese Characters Information
Chenhui Chu, Toshiaki Nakazawa and Sadao Kurohashi .......... 464
P2A-4 The Cultivation of a Chinese-English-Japanese Trilingual Parallel Corpus from
Bin Lu, Ka Po Chow and Benjamin K. Tsou .......... 472
P2A-5 Evaluation Methodology and Results for English-to-Arabic MT
Olivier Hamon and Khalid Choukri .......... 480
P2A-6 Example-Based Machine Translation for Low-Resource Language Using Chunk-String Templates
P2A-7 Improve SMT with Source-Side “Topic-Document” Distributions
Zhengxian Gong, Guodong Zhou and Liangyou Li .......... 496
P2C System Presentations
P2C-1 AIR-based light clients for supporting Moses engine training
Jeffrey Rueppel, Li Jiang, Gong Yu and Ray Flournoy .......... 503
P2C-2 LetsMT!: Cloud-Based Platform for Building User Tailored Machine Translation Engines
Andrejs Vasiljevs, Raivis Skadiņš and Jörg Tiedemann .......... 507
A6 Research Papers – Evaluation
A6-1 Predicting Machine Translation Adequacy
Lucia Specia, Najeh Hajlaoui, Catalina Hallett and Wilker Aziz .......... 513
A6-2 Getting Expert Quality from the Crowd for Machine Translation Evaluation
Luisa Bentivogli, Marcello Federico, Giovanni Moretti and Michael Paul .......... 521
A6-3 A Framework for Diagnostic Evaluation of MT Based on Linguistic Checkpoints
Sudip Kumar Naskar, Antonio Toral, Federico Gaspari and
A6-4 Comparative Evaluation of Term Informativeness Measures in Machine Translation
Billy Wong and Chunyu Kit .......... 537
B6 Research Papers – System Combination
B6-1 System Combination for Machine Translation Based on Text-to-Text Generation
Wei-Yun Ma and Kathleen Mckeown .......... 546
B6-2 Hybrid Machine Translation Guided by a Rule–Based System
Cristina España-Bonet, Gorka Labaka, Arantza Díaz de Ilarraza, Lluís Màrquez and Kepa Sarasola .......... 554
B6-3 Integrating shallow-transfer rules into phrase-based statistical machine translation
Víctor M. Sánchez-Cartagena, Felipe Sánchez-Martínez and Juan Antonio Pérez-Ortiz .......... 562
B6-4 Hypergraph Training and Decoding of System Combination in SMT
Yupeng Liu, Tiejun Zhao and Sheng Li .......... 570
C6 User’s Studies (2)
Na Ye and Guiping Zhang .......... 579
C6-2 UTX 1.11, a Simple and Open User Dictionary/Terminology Standard, and its Effectiveness with Multiple MT Systems
Seiji Okura, Yuji Yamamoto, Hajime Ito, Michael Kato, Miwako Shimazu and Francis Bond .......... 587
C6-3 Real-time Multi-media Translation for Healthcare: a Usability Study
Mark Seligman and Mike Dillinger .......... 595