Maxim Khalilov - ACL Anthology

Maxim Khalilov

2020

A Post-Editing Dataset in the Legal Domain: Do we Underestimate Neural Machine Translation Quality?
Julia Ive | Lucia Specia | Sara Szoc | Tom Vanallemeersch | Joachim Van den Bogaert | Eduardo Farah | Christine Maroti | Artur Ventura | Maxim Khalilov
Proceedings of the Twelfth Language Resources and Evaluation Conference

We introduce a machine translation dataset for three pairs of languages in the legal domain with post-edited high-quality neural machine translation and independent human references. The data was collected as part of the EU APE-QUEST project and comprises crawled content from EU websites with translation from English into three European languages: Dutch, French and Portuguese. Altogether, the data consists of around 31K tuples including a source sentence, the respective machine translation by a neural machine translation system, a post-edited version of such translation by a professional translator, and - where available - the original reference translation crawled from parallel language websites. We describe the data collection process, provide an analysis of the resulting post-edits and benchmark the data using state-of-the-art quality estimation and automatic post-editing models. One interesting by-product of our post-editing analysis suggests that neural systems built with publicly available general domain data can provide high-quality translations, even though comparison to human references suggests that this quality is quite low. This makes our dataset a suitable candidate to test evaluation metrics. The data is freely available as an ELRC-SHARE resource.

2019

APE-QUEST
Joachim Van den Bogaert | Heidi Depraetere | Sara Szoc | Tom Vanallemeersch | Koen Van Winckel | Frederic Everaert | Lucia Specia | Julia Ive | Maxim Khalilov | Christine Maroti | Eduardo Farah | Artur Ventura
Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks

2018

Machine translation at Booking.com: what’s next?
Maxim Khalilov
Proceedings of the AMTA 2018 Workshop on Translation Quality Estimation and Automatic Post-Editing

2017

Toward a full-scale neural machine translation in production: the Booking.com use case
Pavel Levin | Nishikant Dhanuka | Talaat Khalil | Fedor Kovalev | Maxim Khalilov
Proceedings of Machine Translation Summit XVI: Commercial MT Users and Translators Track

2016

Evaluation of machine translation quality in e-commerce environment
Maxim Khalilov
Conferences of the Association for Machine Translation in the Americas: MT Users' Track

2014

Machine translation for LSPs: strategy and implementation
Maxim Khalilov
Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra)

2013

English-to-Russian MT evaluation campaign
Pavel Braslavski | Alexander Beloborodov | Maxim Khalilov | Serge Sharoff
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

Building English-Chinese and Chinese-English MT engines for the computer software domain
Maxim Khalilov | Rahzeb Choudhury
Proceedings of the 16th Annual Conference of the European Association for Machine Translation

2011

Context-Sensitive Syntactic Source-Reordering by Statistical Transduction
Maxim Khalilov | Khalil Sima’an
Proceedings of 5th International Joint Conference on Natural Language Processing

ILLC-UvA translation system for EMNLP-WMT 2011
Maxim Khalilov | Khalil Sima’an
Proceedings of the Sixth Workshop on Statistical Machine Translation

2010

Source reordering using MaxEnt classifiers and supertags
Maxim Khalilov | Khalil Sima’an
Proceedings of the 14th Annual Conference of the European Association for Machine Translation

ILLC-UvA machine translation system for the IWSLT 2010 evaluation
Maxim Khalilov | Khalil Sima’an
Proceedings of the 7th International Workshop on Spoken Language Translation: Evaluation Campaign

Towards Improving English-Latvian Translation: A System Comparison and a New Rescoring Feature
Maxim Khalilov | José A. R. Fonollosa | Inguna Skadiņa | Edgars Brālītis | Lauma Pretkalniņa
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Translation into languages with relatively free word order has received much less attention than translation into fixed word order languages (English) or into analytical languages (Chinese). At the same time, this translation task is among the most difficult challenges for machine translation (MT), and intuitively there seems to be room for improvement by reflecting the free word order structure of the target language. This paper presents a comparative study of two alternative approaches to statistical machine translation (SMT) and their application to the task of English-to-Latvian translation. Furthermore, a novel feature intended to reflect the relatively free word order of the Latvian language is proposed and successfully applied at the n-best list rescoring step. Moving beyond the automatic scores of translation quality classically presented in MT research papers, we contribute a manual error analysis of the MT systems' output that helps to shed light on the advantages and disadvantages of the SMT systems under consideration.

A Discriminative Syntactic Model for Source Permutation via Tree Transduction
Maxim Khalilov | Khalil Sima’an
Proceedings of the 4th Workshop on Syntax and Structure in Statistical Translation

2009

A New Subtree-Transfer Approach to Syntax-Based Reordering for Statistical Machine Translation
Maxim Khalilov | José A. R. Fonollosa | Mark Dras
Proceedings of the 13th Annual Conference of the European Association for Machine Translation

N-Gram-Based Statistical Machine Translation versus Syntax Augmented Machine Translation: Comparison and System Combination
Maxim Khalilov | José A. R. Fonollosa
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

The TALP-UPC Phrase-Based Translation System for EACL-WMT 2009
José A. R. Fonollosa | Maxim Khalilov | Marta R. Costa-jussà | José B. Mariño | Carlos A. Henríquez Q. | Adolfo Hernández H. | Rafael E. Banchs
Proceedings of the Fourth Workshop on Statistical Machine Translation

Coupling Hierarchical Word Reordering and Decoding in Phrase-Based Statistical Machine Translation
Maxim Khalilov | José A. R. Fonollosa | Mark Dras
Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation (SSST-3) at NAACL HLT 2009

2008

The TALP&I2R SMT systems for IWSLT 2008.
Maxim Khalilov | Marta R. Costa-jussà | Carlos A. Henríquez Q. | José A. R. Fonollosa | Adolfo Hernández H. | José B. Mariño | Rafael E. Banchs | Chen Boxing | Min Zhang | Aiti Aw | Haizhou Li
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper gives a description of the statistical machine translation (SMT) systems developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) for our participation in the IWSLT’08 evaluation campaign. We present Ngram-based (TALPtuples) and phrase-based (TALPphrases) SMT systems. The paper explains the 2008 systems’ architecture and outlines the translation schemes we have used, focusing mainly on the new techniques intended to improve speech-to-speech translation quality. The novelties we have introduced are: an improved reordering method, a linear combination of translation and reordering models, and a new technique for punctuation mark insertion in a phrase-based SMT system. This year we focus on the Arabic-English, Chinese-Spanish and pivot Chinese-(English)-Spanish translation tasks.

The TALP-UPC Ngram-Based Statistical Machine Translation System for ACL-WMT 2008
Maxim Khalilov | Adolfo Hernández H. | Marta R. Costa-jussà | Josep M. Crego | Carlos A. Henríquez Q. | Patrik Lambert | José A. R. Fonollosa | José B. Mariño | Rafael E. Banchs
Proceedings of the Third Workshop on Statistical Machine Translation

2007

The TALP ngram-based SMT system for IWSLT 2007
Patrik Lambert | Marta R. Costa-jussà | Josep M. Crego | Maxim Khalilov | José B. Mariño | Rafael E. Banchs | José A. R. Fonollosa | Holger Schwenk
Proceedings of the Fourth International Workshop on Spoken Language Translation

This paper describes TALPtuples, the 2007 N-gram-based statistical machine translation system developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) in Barcelona. Emphasis is put on improvements and extensions of the system of previous years. Mainly, these include optimizing alignment parameters as a function of translation metric scores and rescoring with a neural network language model. Results are reported for two translation directions, namely from Arabic and Chinese into English, and all language-related preprocessing and translation schemes are explained thoroughly.

Ngram-Based Statistical Machine Translation Enhanced with Multiple Weighted Reordering Hypotheses
Marta R. Costa-jussà | Josep M. Crego | Patrik Lambert | Maxim Khalilov | José A. R. Fonollosa | José B. Mariño | Rafael E. Banchs
Proceedings of the Second Workshop on Statistical Machine Translation

2006

The TALP Ngram-based SMT systems for IWSLT 2006
Josep M. Crego | Adrià de Gispert | Patrik Lambert | Maxim Khalilov | Marta R. Costa-jussà | José B. Mariño | Rafael Banchs | José A. R. Fonollosa
Proceedings of the Third International Workshop on Spoken Language Translation: Evaluation Campaign

TALP phrase-based system and TALP system combination for IWSLT 2006
Marta R. Costa-jussà | Josep M. Crego | Adrià de Gispert | Patrik Lambert | Maxim Khalilov | José A. R. Fonollosa | José B. Mariño | Rafael Banchs
Proceedings of the Third International Workshop on Spoken Language Translation: Evaluation Campaign

TALP Phrase-based statistical translation system for European language pairs
Marta R. Costa-jussà | Josep M. Crego | Adrià de Gispert | Patrik Lambert | Maxim Khalilov | José B. Mariño | José A. R. Fonollosa | Rafael Banchs
Proceedings on the Workshop on Statistical Machine Translation

N-gram-based SMT System Enhanced with Reordering Patterns
Josep M. Crego | Adrià de Gispert | Patrik Lambert | Marta R. Costa-jussà | Maxim Khalilov | Rafael Banchs | José B. Mariño | José A. R. Fonollosa
Proceedings on the Workshop on Statistical Machine Translation