Nicola Ueffing


2022

Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5)
Shervin Malmasi | Oleg Rokhlenko | Nicola Ueffing | Ido Guy | Eugene Agichtein | Surya Kallumadi
Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5)

2021

Proceedings of the 4th Workshop on e-Commerce and NLP
Shervin Malmasi | Surya Kallumadi | Nicola Ueffing | Oleg Rokhlenko | Eugene Agichtein | Ido Guy
Proceedings of the 4th Workshop on e-Commerce and NLP

2020

Proceedings of the 3rd Workshop on e-Commerce and NLP
Shervin Malmasi | Surya Kallumadi | Nicola Ueffing | Oleg Rokhlenko | Eugene Agichtein | Ido Guy
Proceedings of the 3rd Workshop on e-Commerce and NLP

2018

Tutorial: Corpora Quality Management for MT - Practices and Roles
Silvio Picinini | Pete Smith | Nicola Ueffing
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 2: User Track)

Automatic Post-Editing and Machine Translation Quality Estimation at eBay
Nicola Ueffing
Proceedings of the AMTA 2018 Workshop on Translation Quality Estimation and Automatic Post-Editing

Quality Estimation for Automatically Generated Titles of eCommerce Browse Pages
Nicola Ueffing | José G. C. de Souza | Gregor Leusch
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)

At eBay, we automatically generate a large number of natural language titles for eCommerce browse pages using machine translation (MT) technology. While automatic approaches can generate millions of titles very quickly, they are prone to errors. We therefore develop quality estimation (QE) methods that automatically detect low-quality titles in order to prevent them from going live. In this paper, we present two approaches. The first is a Random Forest (RF) model that uses hand-crafted, robust features, which are a mix of established features commonly used in Machine Translation Quality Estimation (MTQE) and new features developed specifically for our task. The second is based on Siamese Networks (SNs), which embed the metadata input sequence and the generated title in the same space and do not require hand-crafted features at all. We thoroughly evaluate and compare these approaches on in-house data. While the RF models are competitive in scenarios with smaller amounts of training data and are somewhat more robust, they are clearly outperformed by the SN models when the amount of training data is larger.

Multi-lingual neural title generation for e-Commerce browse pages
Prashant Mathur | Nicola Ueffing | Gregor Leusch
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)

To give buyers better access to the inventory and to improve search engine optimization, e-Commerce websites automatically generate millions of browse pages. A browse page consists of a set of slot name/value pairs within a given category, grouping multiple items which share some characteristics. These browse pages require a title describing the content of the page. Since the number of browse pages is huge, manual creation of these titles is infeasible. Previous statistical and neural approaches depend heavily on the availability of large amounts of data in a language. In this research, we apply sequence-to-sequence models to generate titles for high-resource as well as low-resource languages by leveraging transfer learning. We train these models on multi-lingual data, thereby creating one joint model which can generate titles in several languages. Performance of the title generation system is evaluated on three languages (English, German, and French), with a particular focus on French as the low-resource language.

2017

Generating titles for millions of browse pages on an e-Commerce site
Prashant Mathur | Nicola Ueffing | Gregor Leusch
Proceedings of the 10th International Conference on Natural Language Generation

We present two approaches to generating titles for browse pages in five languages: English, German, French, Italian, and Spanish. These browse pages are structured search pages in an e-commerce domain. We first present a rule-based approach to generating these browse page titles. We then present a hybrid approach which uses a phrase-based statistical machine translation engine on top of the rule-based system to assemble the best title. For two of the languages, English and German, we have access to a large number of existing rule-based titles and their curated counterparts. For these languages we present an automatic post-editing approach which learns how to post-edit the rule-based titles into curated titles.

A detailed investigation of Bias Errors in Post-editing of MT output
Silvio Picinini | Nicola Ueffing
Proceedings of Machine Translation Summit XVI: Commercial MT Users and Translators Track

2008

Tighter Integration of Rule-Based and Statistical MT in Serial System Combination
Nicola Ueffing | Jens Stephan | Evgeny Matusov | Loïc Dugast | George Foster | Roland Kuhn | Jean Senellart | Jin Yang
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

2007

Transductive learning for statistical machine translation
Nicola Ueffing | Gholamreza Haffari | Anoop Sarkar
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

NRC’s PORTAGE System for WMT 2007
Nicola Ueffing | Michel Simard | Samuel Larkin | Howard Johnson
Proceedings of the Second Workshop on Statistical Machine Translation

Rule-Based Translation with Statistical Phrase-Based Post-Editing
Michel Simard | Nicola Ueffing | Pierre Isabelle | Roland Kuhn
Proceedings of the Second Workshop on Statistical Machine Translation

Word-Level Confidence Estimation for Machine Translation
Nicola Ueffing | Hermann Ney
Computational Linguistics, Volume 33, Number 1, March 2007

2006

Using monolingual source-language data to improve MT performance
Nicola Ueffing
Proceedings of the Third International Workshop on Spoken Language Translation: Papers

Computing Consensus Translation for Multiple Machine Translation Systems Using Enhanced Hypothesis Alignment
Evgeny Matusov | Nicola Ueffing | Hermann Ney
11th Conference of the European Chapter of the Association for Computational Linguistics

CDER: Efficient MT Evaluation Using Block Movements
Gregor Leusch | Nicola Ueffing | Hermann Ney
11th Conference of the European Chapter of the Association for Computational Linguistics

2005

Preprocessing and Normalization for Automatic Evaluation of Machine Translation
Gregor Leusch | Nicola Ueffing | David Vilar | Hermann Ney
Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization

Application of word-level confidence measures in interactive statistical machine translation
Nicola Ueffing | Hermann Ney
Proceedings of the 10th EAMT Conference: Practical applications of machine translation

Word-Level Confidence Estimation for Machine Translation using Phrase-Based Translation Models
Nicola Ueffing | Hermann Ney
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

2004

Confidence Estimation for Machine Translation
John Blatz | Erin Fitzgerald | George Foster | Simona Gandrabur | Cyril Goutte | Alex Kulesza | Alberto Sanchis | Nicola Ueffing
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2003

A novel string-to-string distance measure with applications to machine translation evaluation
Gregor Leusch | Nicola Ueffing | Hermann Ney
Proceedings of Machine Translation Summit IX: Papers

We introduce a string-to-string distance measure which extends the edit distance with block transpositions as a constant-cost edit operation. An algorithm for calculating this distance measure in polynomial time is presented. We then demonstrate how this distance measure can be used as an evaluation criterion in machine translation. The correlation between this evaluation criterion and human judgment is systematically compared with that of other automatic evaluation measures on two translation tasks. In general, like other automatic evaluation measures, the criterion shows low correlation at sentence level, but good correlation at system level.
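
For orientation, the plain word-level edit distance that this measure extends can be sketched as follows. This is only the baseline with insertions, deletions, and substitutions; it does not implement the block transposition operation from the paper, and the function name `edit_distance` is a hypothetical choice for illustration.

```python
def edit_distance(hyp, ref):
    """Plain Levenshtein distance between two token sequences.

    Baseline only: insertions, deletions, and substitutions each cost 1.
    The block transposition operation described in the abstract is NOT
    implemented here.
    """
    # prev[j] holds the distance between the hypothesis prefix processed
    # so far and the first j reference tokens.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # aligning i hypothesis tokens to an empty reference
        for j, r in enumerate(ref, start=1):
            substitution = 0 if h == r else 1
            curr.append(min(
                prev[j] + 1,                 # delete h
                curr[j - 1] + 1,             # insert r
                prev[j - 1] + substitution,  # match or substitute
            ))
        prev = curr
    return prev[-1]
```

The function accepts any pair of sequences, so it can be applied to word lists (`"the cat sat".split()`) or to plain strings at character level.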

Confidence measures for statistical machine translation
Nicola Ueffing | Klaus Macherey | Hermann Ney
Proceedings of Machine Translation Summit IX: Papers

In this paper, we present several confidence measures for (statistical) machine translation. We introduce word posterior probabilities for words in the target sentence that can be determined either on a word graph or on an N-best list. Two alternative confidence measures that can be calculated on N-best lists are proposed. The performance of the measures is evaluated on two different translation tasks: spontaneously spoken dialogues from the domain of appointment scheduling, and a collection of technical manuals.
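
As a rough illustration of a word posterior probability computed on an N-best list, the sketch below sums the normalized probabilities of all hypotheses that contain a given word at a given target position. The fixed-position criterion, the softmax normalization of log scores, and the name `word_posteriors` are assumptions made for this example; the abstract does not specify the exact formulation used in the paper.

```python
import math
from collections import defaultdict


def word_posteriors(nbest):
    """Word posteriors from an N-best list.

    nbest: list of (tokens, log_score) pairs, one per hypothesis.
    Returns {(position, word): posterior}. Assumption: a word counts as
    occurring only if it appears at exactly the same target position.
    """
    # Normalize the hypothesis scores into a distribution over the list.
    # Subtracting the maximum keeps the exponentials numerically stable.
    max_log = max(score for _, score in nbest)
    weights = [math.exp(score - max_log) for _, score in nbest]
    total = sum(weights)

    posteriors = defaultdict(float)
    for (tokens, _), weight in zip(nbest, weights):
        for position, word in enumerate(tokens):
            posteriors[(position, word)] += weight / total
    return dict(posteriors)
```

A word whose posterior falls below some threshold could then be flagged as unreliable; how to choose that threshold is outside the scope of this sketch.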

Using POS Information for SMT into Morphologically Rich Languages
Nicola Ueffing | Hermann Ney
10th Conference of the European Chapter of the Association for Computational Linguistics

2002

Generation of Word Graphs in Statistical Machine Translation
Nicola Ueffing | Franz Josef Och | Hermann Ney
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)

2001

An Efficient A* Search Algorithm for Statistical Machine Translation
Franz Josef Och | Nicola Ueffing | Hermann Ney
Proceedings of the ACL 2001 Workshop on Data-Driven Methods in Machine Translation