Eleftheria Briakou


BitextEdit: Automatic Bitext Editing for Improved Low-Resource Machine Translation
Eleftheria Briakou | Sida Wang | Luke Zettlemoyer | Marjan Ghazvininejad
Findings of the Association for Computational Linguistics: NAACL 2022

Mined bitexts can contain imperfect translations that yield unreliable training signals for Neural Machine Translation (NMT). While filtering such pairs out is known to improve final model quality, we argue that it is suboptimal in low-resource conditions where even mined data can be limited. In our work, we propose instead to refine the mined bitexts via automatic editing: given a sentence in a language xf and a possibly imperfect translation of it xe, our model generates a revised version xf' or xe' that yields a more equivalent translation pair (i.e., <xf, xe'> or <xf', xe>). We use a simple editing strategy: (1) mine potentially imperfect translations for each sentence in a given bitext, and (2) train a model, in a multi-task fashion, to reconstruct the original translations and to translate. Experiments demonstrate that our approach successfully improves the quality of CCMatrix mined bitext for 5 low-resource language pairs and 10 translation directions by up to 8 BLEU points, in most cases improving upon a competitive translation-based baseline.
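As a rough illustration of steps (1) and (2), the sketch below builds multi-task training examples from a bitext; mine_near_miss and the task tags are hypothetical placeholders standing in for the mining procedure and task conditioning, not the paper's implementation.

```python
def build_examples(bitext, mine_near_miss):
    """bitext: iterable of (src, tgt) translation pairs.
    mine_near_miss(src): returns a possibly imperfect translation of src."""
    examples = []
    for src, tgt in bitext:
        noisy_tgt = mine_near_miss(src)  # (1) mine an imperfect pairing
        # (2a) editing task: revise the imperfect target back into the original
        examples.append((f"<edit> {src} <sep> {noisy_tgt}", tgt))
        # (2b) translation task: plain source-to-target translation
        examples.append((f"<translate> {src}", tgt))
    return examples
```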

Can Synthetic Translations Improve Bitext Quality?
Eleftheria Briakou | Marine Carpuat
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Synthetic translations have been used for a wide range of NLP tasks primarily as a means of data augmentation. This work explores, instead, how synthetic translations can be used to revise potentially imperfect reference translations in mined bitext. We find that synthetic samples can improve bitext quality without any additional bilingual supervision when they replace the originals based on a semantic equivalence classifier that helps mitigate NMT noise. The improved quality of the revised bitext is confirmed intrinsically via human evaluation and extrinsically through bilingual induction and MT tasks.
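The replacement rule lends itself to a short sketch; here translate and equivalence_score are assumed interfaces standing in for an NMT system and the semantic equivalence classifier.

```python
def revise_bitext(bitext, translate, equivalence_score):
    """Keep each mined reference unless the classifier prefers the synthetic one."""
    revised = []
    for src, ref in bitext:
        synthetic = translate(src)  # forward-translate the source with NMT
        # replace the original reference only when the classifier judges the
        # synthetic translation more semantically equivalent to the source
        if equivalence_score(src, synthetic) > equivalence_score(src, ref):
            revised.append((src, synthetic))
        else:
            revised.append((src, ref))
    return revised
```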


Olá, Bonjour, Salve! XFORMAL: A Benchmark for Multilingual Formality Style Transfer
Eleftheria Briakou | Di Lu | Ke Zhang | Joel Tetreault
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

We take the first step towards multilingual style transfer by creating and releasing XFORMAL, a benchmark of multiple formal reformulations of informal text in Brazilian Portuguese, French, and Italian. Results on XFORMAL suggest that state-of-the-art style transfer approaches perform close to simple baselines, indicating that style transfer becomes even more challenging in a multilingual setting.

A Review of Human Evaluation for Style Transfer
Eleftheria Briakou | Sweta Agrawal | Ke Zhang | Joel Tetreault | Marine Carpuat
Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021)

This paper reviews and summarizes human evaluation practices described in 97 style transfer papers with respect to three main evaluation aspects: style transfer, meaning preservation, and fluency. In principle, evaluations by human raters should be the most reliable. However, in style transfer papers, we find that protocols for human evaluations are often underspecified and not standardized, which hampers the reproducibility of research in this field and progress toward better human and automatic evaluation methods.

Beyond Noise: Mitigating the Impact of Fine-grained Semantic Divergences on Neural Machine Translation
Eleftheria Briakou | Marine Carpuat
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

While it has been shown that Neural Machine Translation (NMT) is highly sensitive to noisy parallel training samples, prior work treats all types of mismatches between source and target as noise. As a result, it remains unclear how samples that are mostly equivalent but contain a small number of semantically divergent tokens impact NMT training. To close this gap, we analyze the impact of different types of fine-grained semantic divergences on Transformer models. We show that models trained on synthetic divergences output degenerate text more frequently and are less confident in their predictions. Based on these findings, we introduce a divergence-aware NMT framework that uses factors to help NMT recover from the degradation caused by naturally occurring divergences, improving both translation quality and model calibration on EN-FR tasks.
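One common way to inject token-level factors into a Transformer is to add a learned factor embedding to each token embedding; the minimal sketch below illustrates that idea under this assumption, and is not necessarily the paper's exact architecture.

```python
import torch.nn as nn

class FactoredEmbedding(nn.Module):
    """Token embeddings augmented with a divergence factor per token."""
    def __init__(self, vocab_size, n_factors, dim):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, dim)
        self.fac = nn.Embedding(n_factors, dim)  # e.g., 0=equivalent, 1=divergent

    def forward(self, token_ids, factor_ids):
        # each token's embedding is shifted by its divergence-factor embedding
        return self.tok(token_ids) + self.fac(factor_ids)
```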

Evaluating the Evaluation Metrics for Style Transfer: A Case Study in Multilingual Formality Transfer
Eleftheria Briakou | Sweta Agrawal | Joel Tetreault | Marine Carpuat
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

While the field of style transfer (ST) has been growing rapidly, it has been hampered by a lack of standardized practices for automatic evaluation. In this paper, we evaluate leading automatic metrics on the oft-researched task of formality style transfer. Unlike previous evaluations, which focus solely on English, we expand our focus to Brazilian Portuguese, French, and Italian, making this work the first multilingual evaluation of metrics in ST. We outline best practices for automatic evaluation in (formality) style transfer and identify several models that correlate well with human judgments and are robust across languages. We hope that this work will help accelerate development in ST, where human evaluation is often challenging to collect.
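At its core, meta-evaluating a metric reduces to correlating its scores with human judgments; a minimal sketch using Spearman's rho via scipy (the paper's exact correlation protocol may differ):

```python
from scipy.stats import spearmanr

def meta_evaluate(metric_scores, human_scores):
    """Both arguments: parallel lists of per-output (or per-system) scores."""
    rho, pvalue = spearmanr(metric_scores, human_scores)
    return rho, pvalue
```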


Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank
Eleftheria Briakou | Marine Carpuat
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Detecting fine-grained differences in content conveyed in different languages matters for cross-lingual NLP and multilingual corpora analysis, but it is a challenging machine learning problem since annotation is expensive and hard to scale. This work improves the prediction and annotation of fine-grained semantic divergences. We introduce a training strategy for multilingual BERT models by learning to rank synthetic divergent examples of varying granularity. We evaluate our models on the Rationalized English-French Semantic Divergences, a new dataset released with this work, consisting of English-French sentence pairs annotated with semantic divergence classes and token-level rationales. Learning to rank helps detect fine-grained sentence-level divergences more accurately than a strong sentence-level similarity model, while token-level predictions have the potential to further distinguish between coarse and fine-grained divergences.
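The learning-to-rank objective can be sketched with a standard margin ranking loss, where a synthetic pair with a milder (finer-grained) divergence should outscore a coarser one; the score interface is an assumption standing in for the fine-tuned multilingual BERT scorer.

```python
import torch
import torch.nn as nn

margin_loss = nn.MarginRankingLoss(margin=0.5)

def ranking_step(score, finer_pair, coarser_pair):
    """score(pair) -> scalar tensor; finer_pair should be ranked higher."""
    s_fine = score(finer_pair)
    s_coarse = score(coarser_pair)
    target = torch.ones_like(s_fine)  # +1: first argument should outrank second
    return margin_loss(s_fine, s_coarse, target)
```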


The University of Maryland’s Kazakh-English Neural Machine Translation System at WMT19
Eleftheria Briakou | Marine Carpuat
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

This paper describes the University of Maryland’s submission to the WMT 2019 Kazakh-English news translation task. We study the impact of transfer learning from another low-resource but related language. We experiment with different ways of encoding lexical units to maximize lexical overlap between the two language pairs, as well as back-translation and ensembling. The submitted system improves over a Kazakh-only baseline by +5.45 BLEU on newstest2019.
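One standard way to maximize lexical overlap between a related parent language pair and Kazakh-English is to train a single shared subword vocabulary over text from both pairs; a sketch with sentencepiece, where file names and hyperparameters are placeholders rather than the submission's actual settings:

```python
import sentencepiece as spm

# train one BPE model over concatenated text from both language pairs so
# that shared subword units transfer across them
spm.SentencePieceTrainer.train(
    input="parent_pair.txt,kk_en.txt",  # placeholder corpus files
    model_prefix="shared_bpe",
    vocab_size=32000,
    model_type="bpe",
)
sp = spm.SentencePieceProcessor(model_file="shared_bpe.model")
```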

Cross-Topic Distributional Semantic Representations Via Unsupervised Mappings
Eleftheria Briakou | Nikos Athanasiou | Alexandros Potamianos
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

In traditional Distributional Semantic Models (DSMs), the multiple senses of a polysemous word are conflated into a single vector-space representation. In this work, we propose a DSM that learns multiple distributional representations of a word based on different topics. First, a separate DSM is trained for each topic, and then each topic-based DSM is aligned to a common vector space. Our unsupervised mapping approach is motivated by the hypothesis that words preserving their relative distances across different topic semantic sub-spaces constitute robust semantic anchors that define the mappings between them. The aligned cross-topic representations achieve state-of-the-art results on the task of contextual word similarity. Furthermore, evaluation on downstream NLP tasks shows that multiple topic-based embeddings outperform single-prototype models.
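The alignment step can be sketched as an orthogonal Procrustes mapping fitted on anchor words; the paper's anchor-selection heuristic (words preserving relative distances across sub-spaces) is abstracted into the anchors argument here, so this is an illustration of the general technique rather than the paper's exact procedure.

```python
import numpy as np

def align_to_common(topic_vecs, common_vecs, anchors):
    """topic_vecs, common_vecs: dicts word -> np.ndarray of equal dimension."""
    X = np.stack([topic_vecs[w] for w in anchors])   # anchor rows, topic space
    Y = np.stack([common_vecs[w] for w in anchors])  # anchor rows, common space
    # orthogonal Procrustes: W = U V^T from the SVD of X^T Y minimizes ||XW - Y||
    u, _, vt = np.linalg.svd(X.T @ Y)
    W = u @ vt
    return {w: v @ W for w, v in topic_vecs.items()}
```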