Stefan Riezler

2025

Learning to Translate Ambiguous Terminology by Preference Optimization on Post-Edits
Nathaniel Berger | Johannes Eschbach-Dymanus | Miriam Exel | Matthias Huck | Stefan Riezler
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track

In real-world translation scenarios, terminology is rarely one-to-one. Instead, multiple valid translations may appear in a terminology dictionary, but correctness of a translation depends on corporate style guides and context. This can be challenging for neural machine translation (NMT) systems. Luckily, in a corporate context, many examples of human post-edits of valid but incorrect terminology exist. The goal of this work is to learn how to disambiguate our terminology based on these corrections. Our approach is based on preference optimization, using the term post-edit as the knowledge to be preferred. While previous work had to rely on unambiguous translation dictionaries to set hard constraints during decoding, or to add soft constraints in the input, our framework requires neither one-to-one dictionaries nor human intervention at decoding time. We report results on English-German post-edited data and find that the optimal combination of supervised fine-tuning and preference optimization, with both term-specific and full-sequence objectives, yields statistically significant improvements in term accuracy over a strong translation-oriented LLM without significant losses in COMET score. Additionally, we release test sets from our post-edited data and terminology dictionary.

2024

Prompting Large Language Models with Human Error Markings for Self-Correcting Machine Translation
Nathaniel Berger | Stefan Riezler | Miriam Exel | Matthias Huck
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)

While large language models (LLMs) pre-trained on massive amounts of unpaired language data have reached the state-of-the-art in machine translation (MT) of general domain texts, post-editing (PE) is still required to correct errors and to enhance term translation quality in specialized domains. In this paper we present a pilot study of enhancing translation memories (TM) produced by PE (source segments, machine translations, and reference translations, henceforth called PE-TM) for the needs of correct and consistent term translation in technical domains. We investigate a light-weight two-step scenario where at inference time, a human translator marks errors in the first translation step, and in a second step a few similar examples are extracted from the PE-TM to prompt an LLM. Our experiment shows that the additional effort of augmenting translations with human error markings guides the LLM to focus on a correction of the marked errors, yielding consistent improvements over automatic PE (APE) and MT from scratch.

Text-to-OverpassQL: A Natural Language Interface for Complex Geodata Querying of OpenStreetMap
Michael Staniek | Raphael Schumann | Maike Züfle | Stefan Riezler
Transactions of the Association for Computational Linguistics, Volume 12

We present Text-to-OverpassQL, a task designed to facilitate a natural language interface for querying geodata from OpenStreetMap (OSM). The Overpass Query Language (OverpassQL) allows users to formulate complex database queries and is widely adopted in the OSM ecosystem. Generating Overpass queries from natural language input serves multiple use-cases. It enables novice users to utilize OverpassQL without prior knowledge, assists experienced users with crafting advanced queries, and enables tool-augmented large language models to access information stored in the OSM database. In order to assess the performance of current sequence generation models on this task, we propose OverpassNL, a dataset of 8,352 queries with corresponding natural language inputs. We further introduce task-specific evaluation metrics and ground the evaluation of the Text-to-OverpassQL task by executing the queries against the OSM database. We establish strong baselines by finetuning sequence-to-sequence models and adapting large language models with in-context examples. The detailed evaluation reveals strengths and weaknesses of the considered learning strategies, laying the foundations for further research into the Text-to-OverpassQL task.

Post-edits Are Preferences Too
Nathaniel Berger | Miriam Exel | Matthias Huck | Stefan Riezler
Proceedings of the Ninth Conference on Machine Translation

Preference Optimization (PO) techniques are currently among the state-of-the-art techniques for fine-tuning large language models (LLMs) on pairwise preference feedback from human annotators. However, in machine translation, this sort of feedback can be difficult to solicit. Additionally, Kreutzer et al. (2018) have shown that, for machine translation, pairwise preferences are less reliable than other forms of human feedback, such as 5-point ratings. We examine post-edits to see if they can be a source of reliable human preferences by construction. In PO, a human annotator is shown sequences $s_1$ and $s_2$ and asked for a preference judgment, while for post-editing, editors create $s_1$ and know that it should be better than $s_2$. We attempt to use these implicit preferences for PO and show that it helps the model move towards post-edit-like hypotheses and away from machine translation-like hypotheses. Furthermore, we show that best results are obtained by pre-training the model with supervised fine-tuning (SFT) on post-edits in order to promote post-edit-like hypotheses to the top output ranks.
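
The idea of treating a post-edit as the preferred sequence $s_1$ and the original machine translation as the dispreferred sequence $s_2$ can be illustrated with a minimal sketch. The abstract does not name a specific PO objective; the DPO-style loss below is an assumption chosen for illustration, and the function names, the `beta` value, and the use of plain sequence log-probabilities are all hypothetical rather than the authors' implementation.

```python
import math


def make_preference_pair(source, machine_translation, post_edit):
    """Build a preference example: the post-edit is preferred by construction."""
    return {"prompt": source, "chosen": post_edit, "rejected": machine_translation}


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO-style loss for one pair (assumed objective, not taken from the paper).

    Computes -log(sigmoid(beta * margin)), where the margin compares how much
    more the policy prefers the chosen sequence than the reference model does.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # Numerically stable softplus(-x).
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))
```

With equal log-probabilities under policy and reference, the loss equals log 2; it decreases as the policy assigns relatively more probability to the post-edit than to the machine translation.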

2023

Enhancing Supervised Learning with Contrastive Markings in Neural Machine Translation Training
Nathaniel Berger | Miriam Exel | Matthias Huck | Stefan Riezler
Proceedings of the 24th Annual Conference of the European Association for Machine Translation

Supervised learning in Neural Machine Translation (NMT) standardly follows a teacher forcing paradigm where the conditioning context in the model’s prediction is constituted by reference tokens, instead of its own previous predictions. In order to alleviate this lack of exploration in the space of translations, we present a simple extension of standard maximum likelihood estimation by a contrastive marking objective. The additional training signals are extracted automatically from reference translations by comparing the system hypothesis against the reference, and used for up/down-weighting correct/incorrect tokens. The proposed new training procedure requires one additional translation pass over the training set, and does not alter the standard inference setup. We show that training with contrastive markings yields improvements on top of supervised learning, and is especially useful when learning from postedits where contrastive markings indicate human error corrections to the original hypotheses.

Improving End-to-End Speech Translation by Imitation-Based Knowledge Distillation with Synthetic Transcripts
Rebekka Hubert | Artem Sokolov | Stefan Riezler
Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)

End-to-end automatic speech translation (AST) relies on data that combines audio inputs with text translation outputs. Previous work used existing large parallel corpora of transcriptions and translations in a knowledge distillation (KD) setup to distill a neural machine translation (NMT) into an AST student model. While KD allows using larger pretrained models, the reliance of previous KD approaches on manual audio transcripts in the data pipeline restricts the applicability of this framework to AST. We present an imitation learning approach where a teacher NMT system corrects the errors of an AST student without relying on manual transcripts. We show that the NMT teacher can recover from errors in automatic transcriptions and is able to correct erroneous translations of the AST student, leading to improvements of about 4 BLEU points over the standard AST end-to-end baseline on the English-German CoVoST-2 and MuST-C datasets, respectively. Code and data are publicly available: https://github.com/HubReb/imitkd_ast/releases/tag/v1.1

2022

Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas
Raphael Schumann | Stefan Riezler
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Vision and language navigation (VLN) is a challenging visually-grounded language understanding task. Given a natural language navigation instruction, a visual agent interacts with a graph-based environment equipped with panorama images and tries to follow the described route. Most prior work has been conducted in indoor scenarios where best results were obtained for navigation on routes that are similar to the training routes, with sharp drops in performance when testing on unseen environments. We focus on VLN in outdoor scenarios and find that in contrast to indoor VLN, most of the gain in outdoor VLN on unseen data is due to features like junction type embedding or heading delta that are specific to the respective environment graph, while image information plays a very minor role in generalizing VLN to unseen outdoor areas. These findings show a bias to specifics of graph representations of urban environments, demanding that VLN tasks grow in scale and diversity of geographical environments.

Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation
Tsz Kin Lam | Shigehiko Schamoni | Stefan Riezler
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

End-to-end speech translation relies on data that pair source-language speech inputs with corresponding translations into a target language. Such data are notoriously scarce, making synthetic data augmentation by back-translation or knowledge distillation a necessary ingredient of end-to-end training. In this paper, we present a novel approach to data augmentation that leverages audio alignments, linguistic properties, and translation. First, we augment a transcription by sampling from a suffix memory that stores text and audio data. Second, we translate the augmented transcript. Finally, we recombine concatenated audio segments and the generated translation. Our method delivers consistent improvements of up to 0.9 and 1.1 BLEU points on top of augmentation with knowledge distillation on five language pairs on CoVoST 2 and on two language pairs on Europarl-ST, respectively.

JoeyS2T: Minimalistic Speech-to-Text Modeling with JoeyNMT
Mayumi Ohta | Julia Kreutzer | Stefan Riezler
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

JoeyS2T is a JoeyNMT extension for speech-to-text tasks such as automatic speech recognition and end-to-end speech translation. It inherits the core philosophy of JoeyNMT, a minimalist NMT toolkit built on PyTorch, seeking simplicity and accessibility. JoeyS2T’s workflow is self-contained, starting from data pre-processing, over model training and prediction to evaluation, and is seamlessly integrated into JoeyNMT’s compact and simple code base. On top of JoeyNMT’s state-of-the-art Transformer-based Encoder-Decoder architecture, JoeyS2T provides speech-oriented components such as convolutional layers, SpecAugment, CTC-loss, and WER evaluation. Despite its simplicity compared to prior implementations, JoeyS2T performs competitively on English speech recognition and English-to-German speech translation benchmarks. The implementation is accompanied by a walk-through tutorial and available on https://github.com/may-/joeys2t.

2021

Generating Landmark Navigation Instructions from Maps as a Graph-to-Text Problem
Raphael Schumann | Stefan Riezler
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Car-focused navigation services are based on turns and distances of named streets, whereas navigation instructions naturally used by humans are centered around physical objects called landmarks. We present a neural model that takes OpenStreetMap representations as input and learns to generate navigation instructions that contain visible and salient landmarks from human natural language instructions. Routes on the map are encoded in a location- and rotation-invariant graph representation that is decoded into natural language instructions. Our work is based on a novel dataset of 7,672 crowd-sourced instances that have been verified by human navigation in Street View. Our evaluation shows that the navigation instructions generated by our system have similar properties as human-generated instructions, and lead to successful human navigation in Street View.

Don’t Search for a Search Method — Simple Heuristics Suffice for Adversarial Text Attacks
Nathaniel Berger | Stefan Riezler | Sebastian Ebert | Artem Sokolov
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Recently more attention has been given to adversarial attacks on neural networks for natural language processing (NLP). A central research topic has been the investigation of search algorithms and search constraints, accompanied by benchmark algorithms and tasks. We implement an algorithm inspired by zeroth order optimization-based attacks and compare with the benchmark results in the TextAttack framework. Surprisingly, we find that optimization-based methods do not yield any improvement in a constrained setup and slightly benefit from approximate gradient information only in unconstrained setups where search spaces are larger. In contrast, simple heuristics exploiting nearest neighbors without querying the target function yield substantial success rates in constrained setups, and nearly full success rate in unconstrained setups, at an order of magnitude fewer queries. We conclude from these results that current TextAttack benchmark tasks are too easy and constraints are too strict, preventing meaningful research on black-box adversarial text attacks.

Error-Aware Interactive Semantic Parsing of OpenStreetMap
Michael Staniek | Stefan Riezler
Proceedings of Second International Combined Workshop on Spatial Language Understanding and Grounded Communication for Robotics

In semantic parsing of geographical queries against real-world databases such as OpenStreetMap (OSM), unique correct answers do not necessarily exist. Instead, the truth might be lying in the eye of the user, who needs to enter an interactive setup where ambiguities can be resolved and parsing mistakes can be corrected. Our work presents an approach to interactive semantic parsing where an explicit error detection is performed, and a clarification question is generated that pinpoints the suspected source of ambiguity or error and communicates it to the human user. Our experimental results show that a combination of entropy-based uncertainty detection and beam search, together with multi-source training on clarification question, initial parse, and user answer, results in improvements of 1.2% F1 score on a parser that already performs at 90.26% on the NLMaps dataset for OSM semantic parsing.

Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks
Julia Kreutzer | Stefan Riezler | Carolin Lawrence
Proceedings of the 5th Workshop on Structured Prediction for NLP (SPNLP 2021)

Large volumes of interaction logs can be collected from NLP systems that are deployed in the real world. How can this wealth of information be leveraged? Using such interaction logs in an offline reinforcement learning (RL) setting is a promising approach. However, due to the nature of NLP tasks and the constraints of production systems, a series of challenges arise. We present a concise overview of these challenges and discuss possible solutions.

2020

Embedding Meta-Textual Information for Improved Learning to Rank
Toshitaka Kuwa | Shigehiko Schamoni | Stefan Riezler
Proceedings of the 28th International Conference on Computational Linguistics

Neural approaches to learning term embeddings have led to improved computation of similarity and ranking in information retrieval (IR). So far neural representation learning has not been extended to meta-textual information that is readily available for many IR tasks, for example, patent classes in prior-art retrieval, topical information in Wikipedia articles, or product categories in e-commerce data. We present a framework that learns embeddings for meta-textual categories, and optimizes a pairwise ranking objective for improved matching based on combined embeddings of textual and meta-textual information. We show considerable gains in an experimental evaluation on cross-lingual retrieval in the Wikipedia domain for three language pairs, and in the Patent domain for one language pair. Our results emphasize that the mode of combining different types of information is crucial for model improvement.

Correct Me If You Can: Learning from Error Corrections and Markings
Julia Kreutzer | Nathaniel Berger | Stefan Riezler
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation

Sequence-to-sequence learning involves a trade-off between signal strength and annotation cost of training data. For example, machine translation data range from costly expert-generated translations that enable supervised learning, to weak quality-judgment feedback that facilitates reinforcement learning. We present the first user study on annotation cost and machine learnability for the less popular annotation mode of error markings. We show that error markings for translations of TED talks from English to German allow precise credit assignment while requiring significantly less human effort than correcting/post-editing, and that error-marked data can be used successfully to fine-tune neural machine translation models.

LibriVoxDeEn: A Corpus for German-to-English Speech Translation and German Speech Recognition
Benjamin Beilharz | Xin Sun | Sariya Karimova | Stefan Riezler
Proceedings of the Twelfth Language Resources and Evaluation Conference

We present a corpus of sentence-aligned triples of German audio, German text, and English translation, based on German audio books. The speech translation data consist of 110 hours of audio material aligned to over 50k parallel sentences. An even larger dataset comprising 547 hours of German speech aligned to German text is available for speech recognition. The audio data is read speech and thus low in disfluencies. The quality of audio and sentence alignments has been checked by a manual evaluation, showing that speech alignment quality is in general very high. The sentence alignment quality is comparable to well-used parallel translation data and can be adjusted by cutoffs on the automatic alignment score. To our knowledge, this corpus is to date the largest resource for German speech recognition and for end-to-end German-to-English speech translation.

2019

Multi-Task Modeling of Phonographic Languages: Translating Middle Egyptian Hieroglyphs
Philipp Wiesenbach | Stefan Riezler
Proceedings of the 16th International Conference on Spoken Language Translation

Machine translation of ancient languages faces a low-resource problem, caused by the limited amount of available textual source data and their translations. We present a multi-task modeling approach to translating Middle Egyptian that is inspired by recent successful approaches to multi-task learning in end-to-end speech translation. We leverage the phonographic aspect of the hieroglyphic writing system, and show that similar to multi-task learning of speech recognition and translation, joint learning and sharing of structural information between hieroglyph transcriptions, translations, and POS tagging can improve direct translation of hieroglyphs by several BLEU points, using a minimal amount of manual transcriptions.

Joey NMT: A Minimalist NMT Toolkit for Novices
Julia Kreutzer | Jasmijn Bastings | Stefan Riezler
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations

We present Joey NMT, a minimalist neural machine translation toolkit based on PyTorch that is specifically designed for novices. Joey NMT provides many popular NMT features in a small and simple code base, so that novices can easily and quickly learn to use it and adapt it to their needs. Despite its focus on simplicity, Joey NMT supports classic architectures (RNNs, transformers), fast beam search, weight tying, and more, and achieves performance comparable to more complex toolkits on standard benchmarks. We evaluate the accessibility of our toolkit in a user study where novices with general knowledge about PyTorch and NMT and experts work through a self-contained Joey NMT tutorial, showing that novices perform almost as well as experts in a subsequent code quiz. Joey NMT is available at https://github.com/joeynmt/joeynmt.

Self-Regulated Interactive Sequence-to-Sequence Learning
Julia Kreutzer | Stefan Riezler
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Not all types of supervision signals are created equal: Different types of feedback have different costs and effects on learning. We show how self-regulation strategies that decide when to ask for which kind of feedback from a teacher (or from oneself) can be cast as a learning-to-learn problem leading to improved cost-aware sequence-to-sequence learning. In experiments on interactive neural machine translation, we find that the self-regulator discovers an 𝜖-greedy strategy for the optimal cost-quality trade-off by mixing different feedback types including corrections, error markups, and self-supervision. Furthermore, we demonstrate its robustness under domain shift and identify it as a promising alternative to active learning.

Learning Neural Sequence-to-Sequence Models from Weak Feedback with Bipolar Ramp Loss
Laura Jehl | Carolin Lawrence | Stefan Riezler
Transactions of the Association for Computational Linguistics, Volume 7

In many machine learning scenarios, supervision by gold labels is not available and consequently neural models cannot be trained directly by maximum likelihood estimation. In a weak supervision scenario, metric-augmented objectives can be employed to assign feedback to model outputs, which can be used to extract a supervision signal for training. We present several objectives for two separate weakly supervised tasks, machine translation and semantic parsing. We show that objectives should actively discourage negative outputs in addition to promoting a surrogate gold structure. This notion of bipolarity is naturally present in ramp loss objectives, which we adapt to neural models. We show that bipolar ramp loss objectives outperform other non-bipolar ramp loss objectives and minimum risk training on both weakly supervised tasks, as well as on a supervised machine translation task. Additionally, we introduce a novel token-level ramp loss objective, which is able to outperform even the best sequence-level ramp loss on both weakly supervised tasks.

Interactive-Predictive Neural Machine Translation through Reinforcement and Imitation
Tsz Kin Lam | Shigehiko Schamoni | Stefan Riezler
Proceedings of Machine Translation Summit XVII: Research Track

2018

A Reinforcement Learning Approach to Interactive-Predictive Neural Machine Translation
Tsz Kin Lam | Julia Kreutzer | Stefan Riezler
Proceedings of the 21st Annual Conference of the European Association for Machine Translation

We present an approach to interactive-predictive neural machine translation that attempts to reduce human effort from three directions: Firstly, instead of requiring humans to select, correct, or delete segments, we employ the idea of learning from human reinforcements in the form of judgments on the quality of partial translations. Secondly, human effort is further reduced by using the entropy of word predictions as uncertainty criterion to trigger feedback requests. Lastly, online updates of the model parameters after every interaction allow the model to adapt quickly. We show in simulation experiments that reward signals on partial translations significantly improve character F-score and BLEU compared to feedback on full translations only, while human effort can be reduced to an average number of 5 feedback requests for every input.
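
The entropy-based trigger mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the function names, the natural-log entropy, and the fixed threshold are hypothetical, and the abstract does not specify how the criterion is thresholded or aggregated over a partial translation.

```python
import math


def prediction_entropy(probs):
    """Shannon entropy (in nats) of one next-word probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def needs_feedback(probs, threshold):
    """Request human feedback when the model is uncertain about the next word.

    `threshold` is a hypothetical tuning parameter, not a value from the paper.
    """
    return prediction_entropy(probs) > threshold
```

A uniform distribution over four words has entropy log 4 and would trigger a request at a threshold of 1.0, while a sharply peaked distribution would not.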

Can Neural Machine Translation be Improved with User Feedback?
Julia Kreutzer | Shahram Khadivi | Evgeny Matusov | Stefan Riezler
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)

We present the first real-world application of methods for improving neural machine translation (NMT) with human reinforcement, based on explicit and implicit user feedback collected on the eBay e-commerce platform. Previous work has been confined to simulation experiments, whereas in this paper we work with real logged feedback for offline bandit learning of NMT parameters. We conduct a thorough analysis of the available explicit user judgments—five-star ratings of translation quality—and show that they are not reliable enough to yield significant improvements in bandit learning. In contrast, we successfully utilize implicit task-based feedback collected in a cross-lingual search task to improve task-specific and machine translation quality metrics.

Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning
Julia Kreutzer | Joshua Uyheng | Stefan Riezler
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We present a study on reinforcement learning (RL) from human bandit feedback for sequence-to-sequence learning, exemplified by the task of bandit neural machine translation (NMT). We investigate the reliability of human bandit feedback, and analyze the influence of reliability on the learnability of a reward estimator, and the effect of the quality of reward estimates on the overall RL task. Our analysis of cardinal (5-point ratings) and ordinal (pairwise preferences) feedback shows that their intra- and inter-annotator α-agreement is comparable. Best reliability is obtained for standardized cardinal feedback, and cardinal feedback is also easiest to learn and generalize from. Finally, improvements of over 1 BLEU can be obtained by integrating a regression-based reward estimator trained on cardinal feedback for 800 translations into RL for NMT. This shows that RL is possible even from small amounts of fairly reliable human feedback, pointing to a great potential for applications at larger scale.
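
What "standardized cardinal feedback" could look like in practice is sketched below, assuming that standardization means converting each annotator's 5-point ratings to z-scores using that annotator's own mean and standard deviation. The abstract does not spell out the procedure, so this reading, the function name, and the data layout are assumptions.

```python
from statistics import mean, pstdev


def standardize_per_rater(ratings_by_rater):
    """Convert each rater's raw ratings to z-scores (assumed standardization).

    ratings_by_rater maps a rater id to that rater's list of numeric ratings.
    A rater who gives the same rating everywhere gets all zeros.
    """
    standardized = {}
    for rater, ratings in ratings_by_rater.items():
        mu = mean(ratings)
        sigma = pstdev(ratings)
        standardized[rater] = [
            0.0 if sigma == 0 else (r - mu) / sigma for r in ratings
        ]
    return standardized
```

Under this sketch, a lenient rater giving 3, 4, 5 and a strict rater giving 1, 2, 3 to the same translations end up with identical standardized scores, which removes per-rater offsets before a reward estimator is trained.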

Improving a Neural Semantic Parser by Counterfactual Learning from Human Bandit Feedback
Carolin Lawrence | Stefan Riezler
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Counterfactual learning from human bandit feedback describes a scenario where user feedback on the quality of outputs of a historic system is logged and used to improve a target system. We show how to apply this learning framework to neural semantic parsing. From a machine learning perspective, the key challenge lies in a proper reweighting of the estimator so as to avoid known degeneracies in counterfactual learning, while still being applicable to stochastic gradient optimization. To conduct experiments with human users, we devise an easy-to-use interface to collect human feedback on semantic parses. Our work is the first to show that semantic parsers can be improved significantly by counterfactual learning from logged human feedback data.

Document-Level Information as Side Constraints for Improved Neural Patent Translation
Laura Jehl | Stefan Riezler
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)

A Dataset and Reranking Method for Multimodal MT of User-Generated Image Captions
Shigehiko Schamoni | Julian Hitschler | Stefan Riezler
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)

2017

Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation
Carolin Lawrence | Artem Sokolov | Stefan Riezler
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

The goal of counterfactual learning for statistical machine translation (SMT) is to optimize a target SMT system from logged data that consist of user feedback to translations that were predicted by another, historic SMT system. A challenge arises by the fact that risk-averse commercial SMT systems deterministically log the most probable translation. The lack of sufficient exploration of the SMT output space seemingly contradicts the theoretical requirements for counterfactual learning. We show that counterfactual learning from deterministic bandit logs is possible nevertheless by smoothing out deterministic components in learning. This can be achieved by additive and multiplicative control variates that avoid degenerate behavior in empirical risk minimization. Our simulation experiments show improvements of up to 2 BLEU points by counterfactual learning from deterministic bandit feedback.

Bandit Structured Prediction for Neural Sequence-to-Sequence Learning
Julia Kreutzer | Artem Sokolov | Stefan Riezler
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Bandit structured prediction describes a stochastic optimization framework where learning is performed from partial feedback. This feedback is received in the form of a task loss evaluation to a predicted output structure, without having access to gold standard structures. We advance this framework by lifting linear bandit learning to neural sequence-to-sequence learning problems using attention-based recurrent neural networks. Furthermore, we show how to incorporate control variates into our learning algorithms for variance reduction and improved generalization. We present an evaluation on a neural machine translation task that shows improvements of up to 5.89 BLEU points for domain adaptation from simulated bandit feedback.

2016

Learning to translate from graded and negative relevance information
Laura Jehl | Stefan Riezler
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We present an approach for learning to translate by exploiting cross-lingual link structure in multilingual document collections. We propose a new learning objective based on structured ramp loss, which learns from graded relevance, explicitly including negative relevance information. Our results on English-German translation of Wikipedia entries show small but significant improvements of our method over an unadapted baseline, even when only a weak relevance signal is used. We also compare our method to monolingual language model adaptation and automatic pseudo-parallel data extraction and find small improvements even over these strong baselines.

NLmaps: A Natural Language Interface to Query OpenStreetMap
Carolin Lawrence | Stefan Riezler
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

We present a Natural Language Interface (nlmaps.cl.uni-heidelberg.de) to query OpenStreetMap. Natural language questions about geographical facts are parsed into database queries that can be executed against the OpenStreetMap (OSM) database. After parsing the question, the system provides a text based answer as well as an interactive map with all points of interest and their relevant information marked. Additionally, we provide several options for users to give feedback after a question has been parsed.

pdf bib abs
A Post-editing Interface for Immediate Adaptation in Statistical Machine Translation
Patrick Simianer | Sariya Karimova | Stefan Riezler
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

Adaptive machine translation (MT) systems are a promising approach for improving the effectiveness of computer-aided translation (CAT) environments. There is, however, virtually only theoretical work that examines how such a system could be implemented. We present an open source post-editing interface for adaptive statistical MT, which has in-depth monitoring capabilities and excellent expandability, and can facilitate practical studies. To this end, we designed text-based and graphical post-editing interfaces. The graphical interface offers means for displaying and editing a rich view of the MT output. Our translation systems may learn from post-edits using several weight, language model and novel translation model adaptation techniques, in part by exploiting the output of the graphical interface. In a user study we show that using the proposed interface and adaptation methods, reductions in technical effort and time can be achieved.

pdf bib
Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning
Stefan Riezler | Yoav Goldberg
Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning

pdf bib
A Corpus and Semantic Parser for Multilingual Natural Language Querying of OpenStreetMap
Carolin Haas | Stefan Riezler
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Learning Structured Predictors from Bandit Feedback for Interactive NLP
Artem Sokolov | Julia Kreutzer | Christopher Lo | Stefan Riezler
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Multimodal Pivots for Image Caption Translation
Julian Hitschler | Shigehiko Schamoni | Stefan Riezler
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

pdf bib
Integrating a Large, Monolingual Corpus as Translation Memory into Statistical Machine Translation
Katharina Wäschle | Stefan Riezler
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

pdf bib
The Heidelberg University English-German translation system for IWSLT 2015
Laura Jehl | Patrick Simianer | Julian Hitschler | Stefan Riezler
Proceedings of the 12th International Workshop on Spoken Language Translation: Evaluation Campaign

pdf bib
Bandit structured prediction for learning from partial feedback in statistical machine translation
Artem Sokolov | Stefan Riezler | Tanguy Urvoy
Proceedings of Machine Translation Summit XV: Papers

pdf bib
Response-based learning for patent translation
Stefan Riezler
Proceedings of the 6th Workshop on Patent and Scientific Literature Translation

pdf bib
A Coactive Learning View of Online Structured Prediction in Statistical Machine Translation
Artem Sokolov | Stefan Riezler | Shay B. Cohen
Proceedings of the Nineteenth Conference on Computational Natural Language Learning

pdf bib
Bag-of-Words Forced Decoding for Cross-Lingual Information Retrieval
Felix Hieber | Stefan Riezler
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Response-based Learning for Machine Translation of Open-domain Database Queries
Carolin Haas | Stefan Riezler
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
QUality Estimation from ScraTCH (QUETCH): Deep Learning for Word-level Translation Quality Estimation
Julia Kreutzer | Shigehiko Schamoni | Stefan Riezler
Proceedings of the Tenth Workshop on Statistical Machine Translation

pdf bib
Integrating a Large, Monolingual Corpus as Translation Memory into Statistical Machine Translation
Katharina Wäschle | Stefan Riezler
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

2014

pdf bib abs
Offline extraction of overlapping phrases for hierarchical phrase-based translation
Sariya Karimova | Patrick Simianer | Stefan Riezler
Proceedings of the 11th International Workshop on Spoken Language Translation: Papers

Standard SMT decoders operate by translating disjoint spans of input words, thus discarding information in the form of overlapping phrases that is present at phrase extraction time. The use of overlapping phrases in translation may enhance fluency in positions that would otherwise be phrase boundaries, they may provide additional statistical support for long and rare phrases, and they may generate new phrases that have never been seen in the training data. We show how to extract overlapping phrases offline for hierarchical phrase-based SMT, and how to extract features and tune weights for the new phrases. We find gains of 0.3 to 0.6 BLEU points over discriminatively trained hierarchical phrase-based SMT systems on two datasets for German-to-English translation.

pdf bib
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics
Shuly Wintner | Sharon Goldwater | Stefan Riezler
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers
Shuly Wintner | Stefan Riezler | Sharon Goldwater
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers

pdf bib
Last Words: On the Problem of Theoretical Terms in Empirical Computational Linguistics
Stefan Riezler
Computational Linguistics, Volume 40, Issue 1 - March 2014

pdf bib
Response-based Learning for Grounded Machine Translation
Stefan Riezler | Patrick Simianer | Carolin Haas
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Learning Translational and Knowledge-based Similarities from Relevance Rankings for Cross-Language Retrieval
Shigehiko Schamoni | Felix Hieber | Artem Sokolov | Stefan Riezler
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2013

pdf bib abs
The Heidelberg University machine translation systems for IWSLT2013
Patrick Simianer | Laura Jehl | Stefan Riezler
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign

We present our systems for the machine translation evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2013. We submitted systems for three language directions: German-to-English, Russian-to-English and English-to-Russian. The focus of our approaches lies on effective usage of the in-domain parallel training data. Therefore, we use the training data to tune parameter weights for millions of sparse lexicalized features using efficient parallelized stochastic learning techniques. For German-to-English we incorporate syntax features. We combine all of our systems with large language models. For the systems involving Russian we also incorporate more data into the building of the translation models.

pdf bib
Generative and Discriminative Methods for Online Adaptation in SMT
Katharina Wäschle | Patrick Simianer | Nicola Bertoldi | Stefan Riezler | Marcello Federico
Proceedings of Machine Translation Summit XIV: Papers

pdf bib
Boosting Cross-Language Retrieval by Learning Bilingual Phrase Associations from Relevance Rankings
Artem Sokolov | Laura Jehl | Felix Hieber | Stefan Riezler
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Task Alternation in Parallel Sentence Retrieval for Twitter Translation
Felix Hieber | Laura Jehl | Stefan Riezler
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Multi-Task Learning for Improved Discriminative Training in SMT
Patrick Simianer | Stefan Riezler
Proceedings of the Eighth Workshop on Statistical Machine Translation
