Alexander Fraser

Also published as: Alex Fraser


2021

pdf bib
A Comparison of Sentence-Weighting Techniques for NMT
Simon Rieß | Matthias Huck | Alex Fraser
Proceedings of Machine Translation Summit XVIII: Research Track

Sentence weighting is a simple and powerful domain adaptation technique. We carry out domain classification for computing sentence weights with 1) language model cross-entropy difference, 2) a convolutional neural network, and 3) a recursive neural tensor network. We compare these approaches with regard to domain classification accuracy and study the posterior probability distributions. Then we carry out NMT experiments in the scenario where we have no in-domain parallel corpora and only very limited in-domain monolingual corpora. Here we use the domain classifier to reweight the sentences of our out-of-domain training corpus. This leads to improvements of up to 2.1 BLEU for German-to-English translation.
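
The first of the three weighting schemes, language model cross-entropy difference, can be sketched in a few lines. The unigram "language models" and sentences below are toy stand-ins for illustration only, not the paper's actual LMs or data:

```python
import math

def cross_entropy(sentence, unigram_probs, floor=1e-6):
    """Per-token cross-entropy of a sentence under a toy unigram LM.
    Unknown tokens get a small floor probability."""
    tokens = sentence.split()
    nll = -sum(math.log(unigram_probs.get(t, floor)) for t in tokens)
    return nll / len(tokens)

def domain_weight(sentence, in_domain_lm, out_domain_lm):
    """Cross-entropy difference score: a higher value means the sentence
    looks more in-domain (cheaper under the in-domain LM)."""
    return cross_entropy(sentence, out_domain_lm) - cross_entropy(sentence, in_domain_lm)

# Toy unigram distributions standing in for trained language models.
in_lm = {"the": 0.2, "patient": 0.3, "dose": 0.3, "was": 0.2}
out_lm = {"the": 0.3, "match": 0.3, "ended": 0.2, "was": 0.2}

# A medical-looking sentence scores positive, an out-of-domain one negative.
w_med = domain_weight("the patient dose was", in_lm, out_lm)
w_news = domain_weight("the match ended", in_lm, out_lm)
```

Out-of-domain training sentences can then be reweighted (or filtered) by this score before NMT training.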

pdf bib
Cross-Lingual Transfer Learning for Hate Speech Detection
Irina Bigoulaeva | Viktor Hangya | Alexander Fraser
Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion

We address the task of automatic hate speech detection for low-resource languages. Rather than collecting and annotating new hate speech data, we show how to use cross-lingual transfer learning to leverage already existing data from higher-resource languages. Using classifiers based on bilingual word embeddings, we achieve good performance on the target language by training only on the source dataset. Using our transferred system, we bootstrap on unlabeled target-language data, improving the performance of standard cross-lingual transfer approaches. We use English as the high-resource language and German as the target language, for which only a small amount of annotated data is available. Our results indicate that cross-lingual transfer learning, together with our approach to leveraging additional unlabeled data, is an effective way of achieving good performance on low-resource target languages without the need for any target-language annotations.

pdf bib
Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation
Alexandra Chronopoulou | Dario Stojanovski | Alexander Fraser
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Successful methods for unsupervised neural machine translation (UNMT) employ cross-lingual pretraining via self-supervision, often in the form of a masked language modeling or a sequence generation task, which requires the model to align the lexical- and high-level representations of the two languages. While cross-lingual pretraining works for similar languages with abundant corpora, it performs poorly in low-resource and distant languages. Previous research has shown that this is because the representations are not sufficiently aligned. In this paper, we enhance the bilingual masked language model pretraining with lexical-level information by using type-level cross-lingual subword embeddings. Empirical results demonstrate improved performance both on UNMT (up to 4.5 BLEU) and bilingual lexicon induction using our method compared to a UNMT baseline.

pdf bib
Anchor-based Bilingual Word Embeddings for Low-Resource Languages
Tobias Eder | Viktor Hangya | Alexander Fraser
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Good-quality monolingual word embeddings (MWEs) can be built for languages which have large amounts of unlabeled text. MWEs can be aligned to bilingual spaces using only a few thousand word translation pairs. For low-resource languages, training MWEs monolingually results in MWEs of poor quality, and thus poor bilingual word embeddings (BWEs) as well. This paper proposes a new approach for building BWEs in which the vector space of the high-resource source language is used as a starting point for training an embedding space for the low-resource target language. By using the source vectors as anchors, the vector spaces are automatically aligned during training. We experiment on English-German, English-Hiligaynon and English-Macedonian. We show that our approach results not only in improved BWEs and bilingual lexicon induction performance, but also in improved target-language MWE quality, as measured using monolingual word similarity.

pdf bib
Addressing Zero-Resource Domains Using Document-Level Context in Neural Machine Translation
Dario Stojanovski | Alexander Fraser
Proceedings of the Second Workshop on Domain Adaptation for NLP

Achieving satisfying performance in machine translation on domains for which there is no training data is challenging. Traditional supervised domain adaptation is not suitable for addressing such zero-resource domains because it relies on in-domain parallel data. We show that when in-domain parallel data is not available, access to document-level context enables better capturing of domain generalities compared to only having access to a single sentence. Having access to more information provides a more reliable domain estimation. We present two document-level Transformer models which are capable of using large context sizes and we compare these models against strong Transformer baselines. We obtain improvements for the two zero-resource domains we study. We additionally provide an analysis where we vary the amount of context and look at the case where in-domain data is available.

2020

pdf bib
ContraCAT: Contrastive Coreference Analytical Templates for Machine Translation
Dario Stojanovski | Benno Krojer | Denis Peskov | Alexander Fraser
Proceedings of the 28th International Conference on Computational Linguistics

Recent high scores on pronoun translation using context-aware neural machine translation have suggested that current approaches work well. ContraPro is a notable example of a contrastive challenge set for English→German pronoun translation. The high scores achieved by transformer models may suggest that they are able to effectively model the complicated set of inferences required to carry out pronoun translation. This entails the ability to determine which entities could be referred to, identify which entity a source-language pronoun refers to (if any), and access the target-language grammatical gender for that entity. We first show through a series of targeted adversarial attacks that in fact current approaches are not able to model all of this information well. Inserting small amounts of distracting information is enough to strongly reduce scores, which should not be the case. We then create a new template test set ContraCAT, designed to individually assess the ability to handle the specific steps necessary for successful pronoun translation. Our analyses show that current approaches to context-aware NMT rely on a set of surface heuristics, which break down when translations require real reasoning. We also propose an approach for augmenting the training data, with some improvements.

pdf bib
Combining Word Embeddings with Bilingual Orthography Embeddings for Bilingual Dictionary Induction
Silvia Severini | Viktor Hangya | Alexander Fraser | Hinrich Schütze
Proceedings of the 28th International Conference on Computational Linguistics

Bilingual dictionary induction (BDI) is the task of accurately translating words to the target language. It is of great importance in many low-resource scenarios where cross-lingual training data is not available. To perform BDI, bilingual word embeddings (BWEs) are often used due to their low bilingual training signal requirements. They achieve high performance, but problematic cases still remain, such as the translation of rare words or named entities, which often need to be transliterated. In this paper, we enrich BWE-based BDI with transliteration information by using Bilingual Orthography Embeddings (BOEs). BOEs represent source and target language transliteration word pairs with similar vectors. A key problem in our BDI setup is to decide which information source – BWEs (or semantics) vs. BOEs (or orthography) – is more reliable for a particular word pair. We propose a novel classification-based BDI system that uses BWEs, BOEs and a number of other features to make this decision. We test our system on English-Russian BDI and show improved performance. In addition, we show the effectiveness of our BOEs by successfully using them for transliteration mining based on cosine similarity.

pdf bib
Exploring Bilingual Word Embeddings for Hiligaynon, a Low-Resource Language
Leah Michel | Viktor Hangya | Alexander Fraser
Proceedings of the 12th Language Resources and Evaluation Conference

This paper investigates the use of bilingual word embeddings for mining Hiligaynon translations of English words. There is very little research on Hiligaynon, an extremely low-resource language of Malayo-Polynesian origin with over 9 million speakers in the Philippines (we found just one paper). We use a publicly available Hiligaynon corpus with only 300K words, and match it with a comparable corpus in English. As there are no bilingual resources available, we manually develop an English-Hiligaynon lexicon and use this to train bilingual word embeddings. However, we fail to mine accurate translations due to the small amount of data. To find out if the same holds true for a related language pair, we simulate the same low-resource setup on English to German and arrive at similar results. We then vary the size of the comparable English and German corpora to determine the minimum corpus size necessary to achieve competitive results. Further, we investigate the role of the seed lexicon. We show that with the same corpus size but a smaller seed lexicon, performance can surpass results of previous studies. We release the lexicon of 1,200 English-Hiligaynon word pairs we created to encourage further investigation.

pdf bib
On the Language Neutrality of Pre-trained Multilingual Representations
Jindřich Libovický | Rudolf Rosa | Alexander Fraser
Findings of the Association for Computational Linguistics: EMNLP 2020

Multilingual contextual embeddings, such as multilingual BERT and XLM-RoBERTa, have proved useful for many multi-lingual tasks. Previous work probed the cross-linguality of the representations indirectly using zero-shot transfer learning on morphological and syntactic tasks. We instead investigate the language-neutrality of multilingual contextual embeddings directly and with respect to lexical semantics. Our results show that contextual embeddings are more language-neutral and, in general, more informative than aligned static word-type embeddings, which are explicitly trained for language neutrality. Contextual embeddings are still only moderately language-neutral by default, so we propose two simple methods for achieving stronger language neutrality: first, by unsupervised centering of the representation for each language and second, by fitting an explicit projection on small parallel data. Besides, we show how to reach state-of-the-art accuracy on language identification and match the performance of statistical methods for word alignment of parallel sentences without using parallel data.
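
The first of the two proposed methods, unsupervised centering, amounts to subtracting each language's mean representation so that embeddings from different languages share a common origin. A minimal sketch with random stand-in embeddings (not the paper's actual contextual representations):

```python
import numpy as np

def center_per_language(embeddings):
    """Subtract the language-specific mean vector from every embedding
    (unsupervised centering for stronger language neutrality)."""
    return embeddings - embeddings.mean(axis=0, keepdims=True)

rng = np.random.default_rng(0)
# Stand-in embeddings for two languages, each with a language-specific
# offset that centering should remove.
lang_a = rng.normal(size=(100, 8)) + 5.0
lang_b = rng.normal(size=(100, 8)) - 3.0

centered_a = center_per_language(lang_a)
centered_b = center_per_language(lang_b)
```

After centering, the per-language means coincide at the origin, removing the systematic offset that separates the two languages in the shared space.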

pdf bib
LMU Bilingual Dictionary Induction System with Word Surface Similarity Scores for BUCC 2020
Silvia Severini | Viktor Hangya | Alexander Fraser | Hinrich Schütze
Proceedings of the 13th Workshop on Building and Using Comparable Corpora

The task of Bilingual Dictionary Induction (BDI) consists of generating translations for source-language words, which is important in the framework of machine translation (MT). The aim of the BUCC 2020 shared task is to perform BDI on various language pairs using comparable corpora. In this paper, we present our approach for the English-German and English-Russian language pairs. Our system relies on Bilingual Word Embeddings (BWEs), which are often used for BDI when only a small seed lexicon is available, making them particularly effective in low-resource settings. On the other hand, they perform well on high-frequency words only. In order to improve performance on rare words as well, we combine BWE-based word similarity with word surface similarity methods, such as orthography. In addition to the often-used top-n translation method, we experiment with a margin-based approach aiming for a dynamic number of translations for each source word. We participate in both the open and closed tracks of the shared task and show improved results of our method compared to simple vector-similarity-based approaches. Our system was ranked in the top three teams and achieved the best results for English-Russian.

pdf bib
Modeling Word Formation in English–German Neural Machine Translation
Marion Weller-Di Marco | Alexander Fraser
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

This paper studies strategies to model word formation in NMT using rich linguistic information, namely a word segmentation approach that goes beyond splitting into substrings by considering fusional morphology. Our linguistically sound segmentation is combined with a method for target-side inflection to accommodate modeling word formation. The best system variants employ source-side morphological analysis and model complex target-side words, improving over a standard system.

pdf bib
Proceedings of the Fifth Conference on Machine Translation
Loïc Barrault | Ondřej Bojar | Fethi Bougares | Rajen Chatterjee | Marta R. Costa-jussà | Christian Federmann | Mark Fishel | Alexander Fraser | Yvette Graham | Paco Guzman | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | André Martins | Makoto Morishita | Christof Monz | Masaaki Nagata | Toshiaki Nakazawa | Matteo Negri
Proceedings of the Fifth Conference on Machine Translation

pdf bib
Findings of the WMT 2020 Shared Tasks in Unsupervised MT and Very Low Resource Supervised MT
Alexander Fraser
Proceedings of the Fifth Conference on Machine Translation

We describe the WMT 2020 Shared Tasks in Unsupervised MT and Very Low Resource Supervised MT. In both tasks, the community studied German to Upper Sorbian and Upper Sorbian to German MT, which is a very realistic machine translation scenario (unlike the simulated scenarios used in particular in much of the unsupervised MT work in the past). We were able to obtain most of the digital data available for Upper Sorbian, a minority language of Germany, which was the original motivation for the Unsupervised MT shared task. As we were defining the task, we also obtained a small amount of parallel data (about 60,000 parallel sentences), allowing us to offer a Very Low Resource Supervised MT task as well. Six primary systems participated in the unsupervised shared task; two of these systems used additional data beyond the data released by the organizers. Ten primary systems participated in the very low resource supervised task. The paper discusses the background, presents the tasks and results, and discusses best practices for the future.

pdf bib
The LMU Munich System for the WMT 2020 Unsupervised Machine Translation Shared Task
Alexandra Chronopoulou | Dario Stojanovski | Viktor Hangya | Alexander Fraser
Proceedings of the Fifth Conference on Machine Translation

This paper describes the submission of LMU Munich to the WMT 2020 unsupervised shared task, in two language directions, German↔Upper Sorbian. Our core unsupervised neural machine translation (UNMT) system follows the strategy of Chronopoulou et al. (2020), using a monolingual pretrained language generation model (on German) and fine-tuning it on both German and Upper Sorbian, before initializing a UNMT model, which is trained with online backtranslation. Pseudo-parallel data obtained from an unsupervised statistical machine translation (USMT) system is used to fine-tune the UNMT model. We also apply BPE-Dropout to the low resource (Upper Sorbian) data to obtain a more robust system. We additionally experiment with residual adapters and find them useful in the Upper Sorbian→German direction. We explore sampling during backtranslation and curriculum learning to use SMT translations in a more principled way. Finally, we ensemble our best-performing systems and reach a BLEU score of 32.4 on German→Upper Sorbian and 35.2 on Upper Sorbian→German.

pdf bib
The LMU Munich System for the WMT20 Very Low Resource Supervised MT Task
Jindřich Libovický | Viktor Hangya | Helmut Schmid | Alexander Fraser
Proceedings of the Fifth Conference on Machine Translation

We present our systems for the WMT20 Very Low Resource MT Task for translation between German and Upper Sorbian. For training our systems, we generate synthetic data by both back- and forward-translation. Additionally, we enrich the training data with German-Czech parallel data whose Czech side we translate into Upper Sorbian using an unsupervised statistical MT system that incorporates orthographically similar word pairs and transliterations of OOV words. Our best translation system between German and Upper Sorbian is based on transfer learning from a Czech-German system and scores 12 to 13 BLEU higher than a baseline system built using only the available parallel data.

pdf bib
Towards Reasonably-Sized Character-Level Transformer NMT by Finetuning Subword Systems
Jindřich Libovický | Alexander Fraser
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Applying the Transformer architecture on the character level usually requires very deep architectures that are difficult and slow to train. These problems can be partially overcome by incorporating a segmentation into tokens in the model. We show that by initially training a subword model and then finetuning it on characters, we can obtain a neural machine translation model that works at the character level without requiring token segmentation. We use only the vanilla 6-layer Transformer Base architecture. Our character-level models better capture morphological phenomena and show more robustness to noise at the expense of somewhat worse overall translation quality. Our study is a significant step towards high-performance and easy-to-train character-based models that are not extremely large.

pdf bib
Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT
Alexandra Chronopoulou | Dario Stojanovski | Alexander Fraser
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Using a language model (LM) pretrained on two languages with large monolingual data in order to initialize an unsupervised neural machine translation (UNMT) system yields state-of-the-art results. When limited data is available for one language, however, this method leads to poor translations. We present an effective approach that reuses an LM that is pretrained only on the high-resource language. The monolingual LM is fine-tuned on both languages and is then used to initialize a UNMT model. To reuse the pretrained LM, we have to modify its predefined vocabulary, to account for the new language. We therefore propose a novel vocabulary extension method. Our approach, RE-LM, outperforms a competitive cross-lingual pretraining model (XLM) in English-Macedonian (En-Mk) and English-Albanian (En-Sq), yielding more than +8.3 BLEU points for all four translation directions.
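
Reusing a monolingual pretrained LM for a new language requires growing its embedding matrix with rows for the new language's subwords. The sketch below illustrates the general idea of such an extension; the initialization shown (mean of existing embeddings plus small noise) is a common heuristic, not necessarily the exact RE-LM method:

```python
import numpy as np

def extend_embedding_matrix(old_embeddings, num_new_tokens, seed=0):
    """Append rows for new-language subwords to a pretrained embedding
    matrix, initializing them near the mean of the existing rows so the
    model starts from a sensible region of the embedding space."""
    rng = np.random.default_rng(seed)
    mean = old_embeddings.mean(axis=0)
    std = old_embeddings.std()
    new_rows = mean + rng.normal(
        scale=0.1 * std, size=(num_new_tokens, old_embeddings.shape[1])
    )
    return np.vstack([old_embeddings, new_rows])

# Toy pretrained embedding table: 1000 subwords, dimension 16.
pretrained = np.random.default_rng(1).normal(size=(1000, 16))
extended = extend_embedding_matrix(pretrained, num_new_tokens=200)
```

The original rows are kept unchanged, so the pretrained knowledge is preserved while the new rows are learned during fine-tuning on both languages.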

pdf bib
Towards Handling Compositionality in Low-Resource Bilingual Word Induction
Viktor Hangya | Alexander Fraser
Proceedings of the 14th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)

2019

pdf bib
Cross-lingual Annotation Projection Is Effective for Neural Part-of-Speech Tagging
Matthias Huck | Diana Dutka | Alexander Fraser
Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects

We tackle the important task of part-of-speech tagging using a neural model in the zero-resource scenario, where we have no access to gold-standard POS training data. We compare this scenario with the low-resource scenario, where we have access to a small amount of gold-standard POS training data. Our experiments focus on Ukrainian as a representative of under-resourced languages. Russian is highly related to Ukrainian, so we exploit gold-standard Russian POS tags. We consider four techniques to perform Ukrainian POS tagging: zero-shot tagging and cross-lingual annotation projection (for the zero-resource scenario), and compare these with self-training and multilingual learning (for the low-resource scenario). We find that cross-lingual annotation projection works particularly well in the zero-resource scenario.
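
Cross-lingual annotation projection can be sketched very compactly: given a word alignment between a tagged source sentence and an untagged target sentence, copy each source tag to its aligned target token. The helper below is a hypothetical minimal illustration, not the paper's implementation:

```python
def project_pos_tags(source_tags, alignment, target_len, default="X"):
    """Project source-side POS tags onto target tokens via word alignment.

    source_tags: POS tag per source token.
    alignment:   list of (src_idx, tgt_idx) alignment links.
    target_len:  number of target tokens; unaligned tokens get `default`.
    """
    target_tags = [default] * target_len
    for src_idx, tgt_idx in alignment:
        target_tags[tgt_idx] = source_tags[src_idx]
    return target_tags

# Toy Russian->Ukrainian example: two aligned tokens inherit their tags.
projected = project_pos_tags(["DET", "NOUN"], [(0, 0), (1, 1)], target_len=2)
```

The projected tags then serve as (noisy) training data for a target-language neural tagger.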

pdf bib
The LMU Munich Unsupervised Machine Translation System for WMT19
Dario Stojanovski | Viktor Hangya | Matthias Huck | Alexander Fraser
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

We describe LMU Munich’s machine translation system for German→Czech translation which was used to participate in the WMT19 shared task on unsupervised news translation. We train our model using monolingual data only from both languages. The final model is an unsupervised neural model using established techniques for unsupervised translation such as denoising autoencoding and online back-translation. We bootstrap the model with masked language model pretraining and enhance it with back-translations from an unsupervised phrase-based system which is itself bootstrapped using unsupervised bilingual word embeddings.

pdf bib
Combining Local and Document-Level Context: The LMU Munich Neural Machine Translation System at WMT19
Dario Stojanovski | Alexander Fraser
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

We describe LMU Munich’s machine translation system for English→German translation which was used to participate in the WMT19 shared task on supervised news translation. We specifically participated in the document-level MT track. The system used as a primary submission is a context-aware Transformer capable of both rich modeling of limited contextual information and integration of large-scale document-level context with a less rich representation. We train this model by fine-tuning a big Transformer baseline. Our experimental results show that document-level context provides for large improvements in translation quality, and adding a rich representation of the previous sentence provides a small additional gain.

pdf bib
Improving Anaphora Resolution in Neural Machine Translation Using Curriculum Learning
Dario Stojanovski | Alexander Fraser
Proceedings of Machine Translation Summit XVII: Research Track

pdf bib
Unsupervised Parallel Sentence Extraction with Parallel Segment Detection Helps Machine Translation
Viktor Hangya | Alexander Fraser
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Mining parallel sentences from comparable corpora is important. Most previous work relies on supervised systems, which are trained on parallel data, thus their applicability is problematic in low-resource scenarios. Recent developments in building unsupervised bilingual word embeddings made it possible to mine parallel sentences based on cosine similarities of source and target language words. We show that relying only on this information is not enough, since sentences often have similar words but different meanings. We detect continuous parallel segments in sentence pair candidates and rely on them when mining parallel sentences. We show better mining accuracy on three language pairs in a standard shared task on artificial data. We also provide the first experiments showing that parallel sentences mined from real life sources improve unsupervised MT. Our code is available; we hope it will be used to support low-resource MT research.

pdf bib
Better OOV Translation with Bilingual Terminology Mining
Matthias Huck | Viktor Hangya | Alexander Fraser
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Unseen words, also called out-of-vocabulary words (OOVs), are difficult for machine translation. In neural machine translation, byte-pair encoding can be used to represent OOVs, but they are still often incorrectly translated. We improve the translation of OOVs in NMT using easy-to-obtain monolingual data. We look for OOVs in the text to be translated and translate them using simple-to-construct bilingual word embeddings (BWEs). In our MT experiments we take the 5-best candidates, which is motivated by intrinsic mining experiments. Using all five of the proposed target language words as queries we mine target-language sentences. We then back-translate, forcing the back-translation of each of the five proposed target-language OOV-translation-candidates to be the original source-language OOV. We show that by using this synthetic data to fine-tune our system the translation of OOVs can be dramatically improved. In our experiments we use a system trained on Europarl and mine sentences containing medical terms from monolingual data.
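
The candidate-retrieval step (ranking target-language words by cosine similarity to an OOV's vector in a bilingual embedding space) can be sketched as follows. The tiny "bilingual" vocabulary and vectors are invented for illustration:

```python
import numpy as np

def knn_translations(query_vec, target_vocab, target_matrix, k=5):
    """Rank target-language words by cosine similarity to the query
    vector and return the k best translation candidates."""
    q = query_vec / np.linalg.norm(query_vec)
    m = target_matrix / np.linalg.norm(target_matrix, axis=1, keepdims=True)
    scores = m @ q
    return [target_vocab[i] for i in np.argsort(-scores)[:k]]

# Toy bilingual space: an OOV medical term should retrieve "fieber" first.
vocab = ["fieber", "husten", "haus"]
matrix = np.array([
    [0.9, 0.1, 0.0],   # fieber (close to the query)
    [0.2, 0.9, 0.1],   # husten
    [0.0, 0.1, 0.9],   # haus
])
oov_vec = np.array([1.0, 0.0, 0.0])

candidates = knn_translations(oov_vec, vocab, matrix, k=2)
```

In the paper's pipeline, each of the top-5 candidates is then used to mine target-language sentences, which are back-translated to produce synthetic fine-tuning data.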

2018

pdf bib
Evaluating bilingual word embeddings on the long tail
Fabienne Braune | Viktor Hangya | Tobias Eder | Alexander Fraser
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

Bilingual word embeddings are useful for bilingual lexicon induction, the task of mining translations of given words. Many studies have shown that bilingual word embeddings perform well for bilingual lexicon induction but they focused on frequent words in general domains. For many applications, bilingual lexicon induction of rare and domain-specific words is of critical importance. Therefore, we design a new task to evaluate bilingual word embeddings on rare words in different domains. We show that state-of-the-art approaches fail on this task and present simple new techniques to improve bilingual word embeddings for mining rare words. We release new gold standard datasets and code to stimulate research on this task.

pdf bib
Two Methods for Domain Adaptation of Bilingual Tasks: Delightfully Simple and Broadly Applicable
Viktor Hangya | Fabienne Braune | Alexander Fraser | Hinrich Schütze
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Bilingual tasks, such as bilingual lexicon induction and cross-lingual classification, are crucial for overcoming data sparsity in the target language. Resources required for such tasks are often out-of-domain, thus domain adaptation is an important problem here. We make two contributions. First, we test a delightfully simple method for domain adaptation of bilingual word embeddings. We evaluate these embeddings on two bilingual tasks involving different domains: cross-lingual twitter sentiment classification and medical bilingual lexicon induction. Second, we tailor a broadly applicable semi-supervised classification method from computer vision to these tasks. We show that this method also helps in low-resource setups. Using both methods together we achieve large improvements over our baselines, by using only additional unlabeled data.

pdf bib
Embedding Learning Through Multilingual Concept Induction
Philipp Dufter | Mengjie Zhao | Martin Schmitt | Alexander Fraser | Hinrich Schütze
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We present a new method for estimating vector space representations of words: embedding learning by concept induction. We test this method on a highly parallel corpus and learn semantic representations of words in 1259 different languages in a single common space. An extensive experimental evaluation on crosslingual word similarity and sentiment analysis indicates that concept-based multilingual embedding learning performs better than previous approaches.

pdf bib
Neural Morphological Tagging of Lemma Sequences for Machine Translation
Costanza Conforti | Matthias Huck | Alexander Fraser
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)

pdf bib
Coreference and Coherence in Neural Machine Translation: A Study Using Oracle Experiments
Dario Stojanovski | Alexander Fraser
Proceedings of the Third Conference on Machine Translation: Research Papers

Cross-sentence context can provide valuable information in Machine Translation and is critical for translation of anaphoric pronouns and for providing consistent translations. In this paper, we devise simple oracle experiments targeting coreference and coherence. Oracles are an easy way to evaluate the effect of different discourse-level phenomena in NMT using BLEU and eliminate the necessity to manually define challenge sets for this purpose. We propose two context-aware NMT models and compare them against models working on a concatenation of consecutive sentences. Concatenation models perform better, but are computationally expensive. We show that NMT models taking advantage of context oracle signals can achieve considerable gains in BLEU, of up to 7.02 BLEU for coreference and 1.89 BLEU for coherence on subtitles translation. Access to strong signals allows us to make clear comparisons between context-aware models.

pdf bib
The LMU Munich Unsupervised Machine Translation Systems
Dario Stojanovski | Viktor Hangya | Matthias Huck | Alexander Fraser
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

We describe LMU Munich’s unsupervised machine translation systems for English↔German translation. These systems were used to participate in the WMT18 news translation shared task and more specifically, for the unsupervised learning sub-track. The systems are trained on English and German monolingual data only and exploit and combine previously proposed techniques such as using word-by-word translated data based on bilingual word embeddings, denoising and on-the-fly backtranslation.

pdf bib
LMU Munich’s Neural Machine Translation Systems at WMT 2018
Matthias Huck | Dario Stojanovski | Viktor Hangya | Alexander Fraser
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

We present the LMU Munich machine translation systems for the English–German language pair. We have built neural machine translation systems for both translation directions (English→German and German→English) and for two different domains (the biomedical domain and the news domain). The systems were used for our participation in the WMT18 biomedical translation task and in the shared task on machine translation of news. The main focus of our recent system development efforts has been on achieving improvements in the biomedical domain over last year’s strong biomedical translation engine for English→German (Huck et al., 2017a). Considerable progress has been made in the latter task, which we report on in this paper.

pdf bib
An Unsupervised System for Parallel Corpus Filtering
Viktor Hangya | Alexander Fraser
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

In this paper we describe LMU Munich’s submission for the WMT 2018 Parallel Corpus Filtering shared task which addresses the problem of cleaning noisy parallel corpora. The task of mining and cleaning parallel sentences is important for improving the quality of machine translation systems, especially for low-resource languages. We tackle this problem in a fully unsupervised fashion relying on bilingual word embeddings created without any bilingual signal. After pre-filtering noisy data we rank sentence pairs by calculating bilingual sentence-level similarities and then remove redundant data by employing monolingual similarity as well. Our unsupervised system achieved good performance during the official evaluation of the shared task, scoring only a few BLEU points behind the best systems, while not requiring any parallel training data.
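
The bilingual sentence-level similarity used for ranking can be approximated by averaging word embeddings into sentence vectors and comparing them with cosine similarity. A minimal sketch with a toy shared embedding space (not the shared-task BWEs):

```python
import numpy as np

def sentence_vector(tokens, word_vectors, dim=4):
    """Average the available word vectors; zero vector if none are known."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

# Toy "bilingual" space in which translations have nearby vectors.
wv = {
    "cat": np.array([1.0, 0.0, 0.0, 0.0]), "katze": np.array([0.9, 0.1, 0.0, 0.0]),
    "dog": np.array([0.0, 1.0, 0.0, 0.0]), "hund": np.array([0.1, 0.9, 0.0, 0.0]),
}

good = cosine(sentence_vector(["cat"], wv), sentence_vector(["katze"], wv))
bad = cosine(sentence_vector(["cat"], wv), sentence_vector(["hund"], wv))
```

Sentence pairs scoring above a threshold would be kept; the paper additionally removes redundant pairs using monolingual similarity.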

2017

pdf bib
Statistical Models for Unsupervised, Semi-Supervised and Supervised Transliteration Mining
Hassan Sajjad | Helmut Schmid | Alexander Fraser | Hinrich Schütze
Computational Linguistics, Volume 43, Issue 2 - June 2017

We present a generative model that efficiently mines transliteration pairs in a consistent fashion in three different settings: unsupervised, semi-supervised, and supervised transliteration mining. The model interpolates two sub-models, one for the generation of transliteration pairs and one for the generation of non-transliteration pairs (i.e., noise). The model is trained on noisy unlabeled data using the EM algorithm. During training the transliteration sub-model learns to generate transliteration pairs and the fixed non-transliteration model generates the noise pairs. After training, the unlabeled data is disambiguated based on the posterior probabilities of the two sub-models. We evaluate our transliteration mining system on data from a transliteration mining shared task and on parallel corpora. For three out of four language pairs, our system outperforms all semi-supervised and supervised systems that participated in the NEWS 2010 shared task. On word pairs extracted from parallel corpora with fewer than 2% transliteration pairs, our system achieves up to 86.7% F-measure with 77.9% precision and 97.8% recall.

pdf bib
Modeling Target-Side Inflection in Neural Machine Translation
Aleš Tamchyna | Marion Weller-Di Marco | Alexander Fraser
Proceedings of the Second Conference on Machine Translation

pdf bib
Target-side Word Segmentation Strategies for Neural Machine Translation
Matthias Huck | Simon Riess | Alexander Fraser
Proceedings of the Second Conference on Machine Translation

pdf bib
LMU Munich’s Neural Machine Translation Systems for News Articles and Health Information Texts
Matthias Huck | Fabienne Braune | Alexander Fraser
Proceedings of the Second Conference on Machine Translation

pdf bib
Annotating tense, mood and voice for English, French and German
Anita Ramm | Sharid Loáiciga | Annemarie Friedrich | Alexander Fraser
Proceedings of ACL 2017, System Demonstrations

pdf bib
Producing Unseen Morphological Variants in Statistical Machine Translation
Matthias Huck | Aleš Tamchyna | Ondřej Bojar | Alexander Fraser
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Translating into morphologically rich languages is difficult. Although the coverage of lemmas may be reasonable, many morphological variants cannot be learned from the training data. We present a statistical translation system that is able to produce these inflected word forms. Different from most previous work, we do not separate morphological prediction from lexical choice into two consecutive steps. Our approach is novel in that it is integrated in decoding and takes advantage of context information from both the source language and the target language sides.

pdf bib
Addressing Problems across Linguistic Levels in SMT: Combining Approaches to Model Morphology, Syntax and Lexical Choice
Marion Weller-Di Marco | Alexander Fraser | Sabine Schulte im Walde
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Many errors in phrase-based SMT can be attributed to problems on three linguistic levels: morphological complexity in the target language, structural differences and lexical choice. We explore combinations of linguistically motivated approaches to address these problems in English-to-German SMT and show that they are complementary to one another, but also that the popular verbal pre-ordering can cause problems on the morphological and lexical level. A discriminative classifier can overcome these problems, in particular when enriching standard lexical features with features geared towards verbal inflection.

2016

pdf bib
Target-Side Context for Discriminative Models in Statistical Machine Translation
Aleš Tamchyna | Alexander Fraser | Ondřej Bojar | Marcin Junczys-Dowmunt
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
HimL: Health in my language
Barry Haddow | Alex Fraser
Proceedings of the 19th Annual Conference of the European Association for Machine Translation: Projects/Products

pdf bib
Modeling verbal inflection for English to German SMT
Anita Ramm | Alexander Fraser
Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers

pdf bib
Modeling Complement Types in Phrase-Based SMT
Marion Weller-Di Marco | Alexander Fraser | Sabine Schulte im Walde
Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers

pdf bib
A Framework for Discriminative Rule Selection in Hierarchical Moses
Fabienne Braune | Alexander Fraser | Hal Daumé III | Aleš Tamchyna
Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers

pdf bib
The Edinburgh/LMU Hierarchical Machine Translation System for WMT 2016
Matthias Huck | Alexander Fraser | Barry Haddow
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

pdf bib
The QT21/HimL Combined Machine Translation System
Jan-Thorsten Peter | Tamer Alkhouli | Hermann Ney | Matthias Huck | Fabienne Braune | Alexander Fraser | Aleš Tamchyna | Ondřej Bojar | Barry Haddow | Rico Sennrich | Frédéric Blain | Lucia Specia | Jan Niehues | Alex Waibel | Alexandre Allauzen | Lauriane Aufrant | Franck Burlot | Elena Knyazeva | Thomas Lavergne | François Yvon | Mārcis Pinnis | Stella Frank
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

pdf bib
CUNI-LMU Submissions in WMT2016: Chimera Constrained and Beaten
Aleš Tamchyna | Roman Sudarikov | Ondřej Bojar | Alexander Fraser
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

2015

pdf bib
Rule Selection with Soft Syntactic Features for String-to-Tree Statistical Machine Translation
Fabienne Braune | Nina Seemann | Alexander Fraser
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Joint Lemmatization and Morphological Tagging with Lemming
Thomas Müller | Ryan Cotterell | Alexander Fraser | Hinrich Schütze
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Predicting Prepositions for SMT
Marion Weller | Alexander Fraser | Sabine Schulte im Walde
Proceedings of the Ninth Workshop on Syntax, Semantics and Structure in Statistical Translation

pdf bib
CimS - The CIS and IMS Joint Submission to WMT 2015 addressing morphological and syntactic differences in English to German SMT
Fabienne Cap | Marion Weller | Anita Ramm | Alexander Fraser
Proceedings of the Tenth Workshop on Statistical Machine Translation

pdf bib
Target-Side Generation of Prepositions for SMT
Marion Weller | Alexander Fraser | Sabine Schulte im Walde
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

pdf bib
The Operation Sequence Model—Combining N-Gram-Based and Phrase-Based Statistical Machine Translation
Nadir Durrani | Helmut Schmid | Alexander Fraser | Philipp Koehn | Hinrich Schütze
Computational Linguistics, Volume 41, Issue 2 - June 2015

pdf bib
Labeled Morphological Segmentation with Semi-Markov Models
Ryan Cotterell | Thomas Müller | Alexander Fraser | Hinrich Schütze
Proceedings of the Nineteenth Conference on Computational Natural Language Learning

2014

pdf bib
How to Produce Unseen Teddy Bears: Improved Morphological Processing of Compounds in SMT
Fabienne Cap | Alexander Fraser | Marion Weller | Aoife Cahill
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
Using noun class information to model selectional preferences for translating prepositions in SMT
Marion Weller | Sabine Schulte im Walde | Alexander Fraser
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track

Translating prepositions is a difficult and under-studied problem in SMT. We present a novel method to improve the translation of prepositions by using noun classes to model their selectional preferences. We compare three variants of noun class information: (i) classes induced from the lexical resource GermaNet or obtained from clusterings based on either (ii) window information or (iii) syntactic features. Furthermore, we experiment with PP rule generalization. While we do not significantly improve over the baseline, our results demonstrate that (i) integrating selectional preferences as rigid class annotation in the parse tree is sub-optimal, and that (ii) clusterings based on window co-occurrence are more robust than syntax-based clusters or GermaNet classes for the task of modeling selectional preferences.

pdf bib
CimS – The CIS and IMS joint submission to WMT 2014 translating from English into German
Fabienne Cap | Marion Weller | Anita Ramm | Alexander Fraser
Proceedings of the Ninth Workshop on Statistical Machine Translation

pdf bib
Distinguishing Degrees of Compositionality in Compound Splitting for Statistical Machine Translation
Marion Weller | Fabienne Cap | Stefan Müller | Sabine Schulte im Walde | Alexander Fraser
Proceedings of the First Workshop on Computational Approaches to Compound Analysis (ComAComA 2014)

pdf bib
Investigating the Usefulness of Generalized Word Representations in SMT
Nadir Durrani | Philipp Koehn | Helmut Schmid | Alexander Fraser
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
Combining bilingual terminology mining and morphological modeling for domain adaptation in SMT
Marion Weller | Alexander Fraser | Ulrich Heid
Proceedings of the 17th Annual conference of the European Association for Machine Translation

pdf bib
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: Tutorials
Alex Fraser | Yang Liu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: Tutorials

2013

pdf bib
Model With Minimal Translation Units, But Decode With Phrases
Nadir Durrani | Alexander Fraser | Helmut Schmid
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Munich-Edinburgh-Stuttgart Submissions of OSM Systems at WMT13
Nadir Durrani | Alexander Fraser | Helmut Schmid | Hassan Sajjad | Richárd Farkas
Proceedings of the Eighth Workshop on Statistical Machine Translation

pdf bib
QCRI-MES Submission at WMT13: Using Transliteration Mining to Improve Statistical Machine Translation
Hassan Sajjad | Svetlana Smekalova | Nadir Durrani | Alexander Fraser | Helmut Schmid
Proceedings of the Eighth Workshop on Statistical Machine Translation

pdf bib
Munich-Edinburgh-Stuttgart Submissions at WMT13: Morphological and Syntactic Processing for SMT
Marion Weller | Max Kisselew | Svetlana Smekalova | Alexander Fraser | Helmut Schmid | Nadir Durrani | Hassan Sajjad | Richárd Farkas
Proceedings of the Eighth Workshop on Statistical Machine Translation

pdf bib
Using subcategorization knowledge to improve case prediction for translation to German
Marion Weller | Alexander Fraser | Sabine Schulte im Walde
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT?
Nadir Durrani | Alexander Fraser | Helmut Schmid | Hieu Hoang | Philipp Koehn
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Knowledge Sources for Constituent Parsing of German, a Morphologically Rich and Less-Configurational Language
Alexander Fraser | Helmut Schmid | Richárd Farkas | Renjing Wang | Hinrich Schütze
Computational Linguistics, Volume 39, Issue 1 - March 2013

pdf bib
Improving Translation to Morphologically Rich Languages (Améliorer la traduction des langages morphologiquement riches) [in French]
Alexander Fraser
Proceedings of TALN 2013 (Volume 4: Invited Conferences)

2012

pdf bib
Long-distance reordering during search for hierarchical phrase-based SMT
Fabienne Braune | Anita Gojun | Alexander Fraser
Proceedings of the 16th Annual conference of the European Association for Machine Translation

pdf bib
Domain Adaptation in Machine Translation: Findings from the 2012 Johns Hopkins Summer Workshop
Hal Daumé III | Marine Carpuat | Alex Fraser | Chris Quirk
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Keynote Presentations

pdf bib
A Statistical Model for Unsupervised and Semi-supervised Transliteration Mining
Hassan Sajjad | Alexander Fraser | Helmut Schmid
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Modeling Inflection and Word-Formation in SMT
Alexander Fraser | Marion Weller | Aoife Cahill | Fabienne Cap
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
Determining the placement of German verbs in English–to–German SMT
Anita Gojun | Alexander Fraser
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

2011

pdf bib
Comparing Two Techniques for Learning Transliteration Models Using a Parallel Corpus
Hassan Sajjad | Nadir Durrani | Helmut Schmid | Alexander Fraser
Proceedings of 5th International Joint Conference on Natural Language Processing

pdf bib
An Algorithm for Unsupervised Transliteration Mining with an Application to Word Alignment
Hassan Sajjad | Alexander Fraser | Helmut Schmid
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf bib
A Joint Sequence Translation Model with Integrated Reordering
Nadir Durrani | Helmut Schmid | Alexander Fraser
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

pdf bib
Bitext-Based Resolution of German Subject-Object Ambiguities
Florian Schwarck | Alexander Fraser | Hinrich Schütze
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
How to Avoid Burning Ducks: Combining Linguistic Analysis and Corpus Statistics for German Compound Processing
Fabienne Fritzinger | Alexander Fraser
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

pdf bib
Hindi-to-Urdu Machine Translation through Transliteration
Nadir Durrani | Hassan Sajjad | Alexander Fraser | Helmut Schmid
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf bib
Improved Unsupervised Sentence Alignment for Symmetrical and Asymmetrical Parallel Corpora
Fabienne Braune | Alexander Fraser
Coling 2010: Posters

2009

pdf bib
Word Alignment by Thresholded Two-Dimensional Normalization
Hamidreza Kobdani | Alexander Fraser | Hinrich Schütze
Proceedings of Machine Translation Summit XII: Posters

pdf bib
Rich Bitext Projection Features for Parse Reranking
Alexander Fraser | Renjing Wang | Hinrich Schütze
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

pdf bib
Experiments in Morphosyntactic Processing for Translating to and from German
Alexander Fraser
Proceedings of the Fourth Workshop on Statistical Machine Translation

2007

pdf bib
Squibs and Discussions: Measuring Word Alignment Quality for Statistical Machine Translation
Alexander Fraser | Daniel Marcu
Computational Linguistics, Volume 33, Number 3, September 2007

pdf bib
Getting the Structure Right for Word Alignment: LEAF
Alexander Fraser | Daniel Marcu
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

pdf bib
Semi-Supervised Training for Statistical Word Alignment
Alexander Fraser | Daniel Marcu
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

2005

pdf bib
ISI’s Participation in the Romanian-English Alignment Task
Alexander Fraser | Daniel Marcu
Proceedings of the ACL Workshop on Building and Using Parallel Texts

2004

pdf bib
Language Weaver Arabic->English MT
Daniel Marcu | Alex Fraser | William Wong | Kevin Knight
Proceedings of the Workshop on Computational Approaches to Arabic Script-based Languages

pdf bib
A Smorgasbord of Features for Statistical Machine Translation
Franz Josef Och | Daniel Gildea | Sanjeev Khudanpur | Anoop Sarkar | Kenji Yamada | Alex Fraser | Shankar Kumar | Libin Shen | David Smith | Katherine Eng | Viren Jain | Zhen Jin | Dragomir Radev
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004

pdf bib
Improved Machine Translation Performance via Parallel Sentence Extraction from Comparable Corpora
Dragos Stefan Munteanu | Alexander Fraser | Daniel Marcu
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004

2003

bib
Issues in Arabic MT
Alex Fraser
Workshop on Machine Translation for Semitic languages: issues and approaches
