Loic Barrault - ACL Anthology

Loic Barrault

Also published as: Loïc Barrault

2025

MEXMA: Token-level objectives improve sentence representations
João Maria Janeiro | Benjamin Piwowarski | Patrick Gallinari | Loic Barrault
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Cross-lingual sentence encoders (CLSE) create fixed-size sentence representations with aligned translations. Current pre-trained CLSE approaches use sentence-level objectives only. This can lead to loss of information, especially for tokens, which then degrades the sentence representation. We propose MEXMA, a novel approach that integrates both sentence-level and token-level objectives. The sentence representation in one language is used to predict masked tokens in another language, with both the sentence representation and all tokens directly updating the encoder. We show that adding token-level objectives greatly improves the sentence representation quality across several tasks. Our approach outperforms current pre-trained cross-lingual sentence encoders on bitext mining as well as several downstream tasks. We also analyse the information encoded in our tokens, and how the sentence representation is built from them.

Mixture of Languages: Improved Multilingual Encoders Through Language Grouping
João Maria Janeiro | Belen Alastruey | Francisco Massa | Maha Elbayad | Benjamin Piwowarski | Patrick Gallinari | Loic Barrault
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

We propose Mixture of Languages (MoL), a new strategy to pretrain largely multilingual encoders. Recent work in this field has relied on training transformer encoders on a large amount of multilingual data, with all parameters shared across all languages, without studying how to optimally balance language transfer and interference to achieve better performance. To address this, MoL proposes to group languages based on their similarity, and add parallel, sparsely activated layers that process each group independently. This architecture allows MoL to boost language transfer while minimizing interference, without increasing the active parameter count. We show that MoL largely outperforms a dense counterpart trained with the same configuration, as well as MoE models and public multilingual encoders such as XLM-R or mBERT on downstream tasks.

2024

Aligning Speech Segments Beyond Pure Semantics
Kevin Heffernan | Artyom Kozhevnikov | Loic Barrault | Alexandre Mourachko | Holger Schwenk
Findings of the Association for Computational Linguistics: ACL 2024

Multilingual parallel data for speech-to-speech translation is scarce and expensive to create from scratch. This is all the more true for expressive speech translation, which aims at preserving not only the semantics, but also the overall prosody (e.g. style, emotion, rate-of-speech). Existing corpora contain speech utterances with the same meaning, yet the overall prosody is typically different, as human annotators are not tasked with reproducing these aspects, or crowd-sourced efforts do not specifically target this kind of alignment as a priority. In this paper, we propose a novel alignment algorithm, which automatically forms pairs of speech segments aligned not only in meaning, but also in expressivity. In order to validate our approach, we train an expressive multilingual speech-to-speech translation system on the automatically aligned data. Our experiments show that in comparison to semantic-only approaches, expressively aligned data yields large improvements in source expressivity preservation (e.g. 43% uplift in speech rate preservation on average), while still maintaining content translation quality. In some scenarios, results also indicate that this alignment algorithm can outperform standard, semantic-focused approaches even on content translation quality.

Speech Data from Radio Broadcasts for Low Resource Languages
Bismarck Bamfo Odoom | Paola Leibny Garcia | Prangthip Hansanti | Loïc Barrault | Christophe Ropers | Matthew Wiesner | Kenton Murray | Alex Mourachko | Philipp Koehn
Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024)

We created a collection of speech data for 48 low resource languages. The corpus is extracted from radio broadcasts and processed with novel speech detection and language identification models based on a manually vetted subset of the audio for 10 languages. The data is made publicly available.

2023

Detecting and Mitigating Hallucinations in Machine Translation: Model Internal Workings Alone Do Well, Sentence Similarity Even Better
David Dale | Elena Voita | Loic Barrault | Marta R. Costa-jussà
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

While the problem of hallucinations in neural machine translation has long been recognized, so far there has been very little progress on alleviating it. Indeed, it recently turned out that without artificially encouraging models to hallucinate, previously existing methods fall short and even the standard sequence log-probability is more informative. This means that internal characteristics of the model can give much more information than we expect, and before using external models and measures, we first need to ask: how far can we go if we use nothing but the translation model itself? We propose to use a method that evaluates the percentage of the source contribution to a generated translation. Intuitively, hallucinations are translations “detached” from the source, hence they can be identified by low source contribution. This method improves detection accuracy for the most severe hallucinations by a factor of 2 and is able to alleviate hallucinations at test time on par with the previous best approach that relies on external models. Next, if we move away from internal model characteristics and allow external tools, we show that using sentence similarity from cross-lingual embeddings further improves these results. We release the code of our experiments.
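
As a rough illustration of the final idea in this abstract (scoring a translation by its cross-lingual sentence similarity to the source), the sketch below flags translations whose embedding is far from the source embedding. It is not the authors' released code: the function names, the cosine measure, and the threshold of 0.5 are all assumptions made for the example, and the embeddings are expected to come from some cross-lingual sentence encoder that the abstract does not name.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def flag_hallucinations(src_embs, hyp_embs, threshold=0.5):
    """Return one boolean per sentence pair: True if the translation looks detached.

    src_embs / hyp_embs: lists of sentence embeddings for sources and translations.
    threshold: hypothetical cut-off; a real system would tune it on annotated data.
    """
    return [cosine(s, h) < threshold for s, h in zip(src_embs, hyp_embs)]
```

With toy two-dimensional vectors, a translation embedding that points the same way as its source is kept, and an orthogonal one is flagged.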

Metaphor Detection with Effective Context Denoising
Shun Wang | Yucheng Li | Chenghua Lin | Loic Barrault | Frank Guerin
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

We propose a novel RoBERTa-based model, RoPPT, which introduces a target-oriented parse tree structure in metaphor detection. Compared to existing models, RoPPT focuses on semantically relevant information and achieves the state-of-the-art on several main metaphor datasets. We also compare our approach against several popular denoising and pruning methods, demonstrating the effectiveness of our approach in context denoising. Our code and dataset can be found at https://github.com/MajiBear000/RoPPT.

FrameBERT: Conceptual Metaphor Detection with Frame Embedding Learning
Yucheng Li | Shun Wang | Chenghua Lin | Frank Guerin | Loic Barrault
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

In this paper, we propose FrameBERT, a BERT-based model that can explicitly learn and incorporate FrameNet Embeddings for concept-level metaphor detection. FrameBERT not only achieves better or comparable performance to the state-of-the-art, but is also more explainable and interpretable than existing models, owing to its ability to account for external knowledge from FrameNet.

HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation
David Dale | Elena Voita | Janice Lam | Prangthip Hansanti | Christophe Ropers | Elahe Kalbassi | Cynthia Gao | Loïc Barrault | Marta R. Costa-jussà
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Hallucinations in machine translation are translations that contain information completely unrelated to the input. Omissions are translations that do not include some of the input information. While both cases tend to be catastrophic errors undermining user trust, annotated data with these types of pathologies is extremely scarce and is limited to a few high-resource languages. In this work, we release an annotated dataset for the hallucination and omission phenomena covering 18 translation directions with varying resource levels and scripts. Our annotation covers different levels of partial and full hallucinations as well as omissions both at the sentence and at the word level. Additionally, we revisit previous methods for hallucination and omission detection, show that conclusions made based on a single language pair largely do not hold for a large-scale evaluation, and establish new solid baselines.

We Need to Talk About Classification Evaluation Metrics in NLP
Peter Vickers | Loic Barrault | Emilio Monti | Nikolaos Aletras
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

2022

On the Importance of Effectively Adapting Pretrained Language Models for Active Learning
Katerina Margatina | Loic Barrault | Nikolaos Aletras
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Recent active learning (AL) approaches in Natural Language Processing (NLP) proposed using off-the-shelf pretrained language models (LMs). In this paper, we argue that these LMs are not adapted effectively to the downstream task during AL and we explore ways to address this issue. We suggest to first adapt the pretrained LM to the target task by continuing training with all the available unlabeled data and then use it for AL. We also propose a simple yet effective fine-tuning method to ensure that the adapted LM is properly trained in both low and high resource scenarios during AL. Our experiments demonstrate that our approach provides substantial data efficiency improvements compared to the standard fine-tuning approach, suggesting that a poor training strategy can be catastrophic for AL.

Controlling Extra-Textual Attributes about Dialogue Participants: A Case Study of English-to-Polish Neural Machine Translation
Sebastian T. Vincent | Loïc Barrault | Carolina Scarton
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

Unlike English, morphologically rich languages can reveal characteristics of speakers or their conversational partners, such as gender and number, via pronouns, morphological endings of words and syntax. When translating from English to such languages, a machine translation model needs to opt for a certain interpretation of textual context, which may lead to serious translation errors if extra-textual information is unavailable. We investigate this challenge in the English-to-Polish language direction. We focus on the underresearched problem of utilising external metadata in automatic translation of TV dialogue, proposing a case study where a wide range of approaches for controlling attributes in translation is employed in a multi-attribute scenario. The best model achieves an improvement of +5.81 chrF++/+6.03 BLEU, with other models achieving competitive performance. We additionally contribute a novel attribute-annotated dataset of Polish TV dialogue and a morphological analysis script used to evaluate attribute control in models.

Findings of the IWSLT 2022 Evaluation Campaign
Antonios Anastasopoulos | Loïc Barrault | Luisa Bentivogli | Marcely Zanon Boito | Ondřej Bojar | Roldano Cattoni | Anna Currey | Georgiana Dinu | Kevin Duh | Maha Elbayad | Clara Emmanuel | Yannick Estève | Marcello Federico | Christian Federmann | Souhir Gahbiche | Hongyu Gong | Roman Grundkiewicz | Barry Haddow | Benjamin Hsu | Dávid Javorský | Vĕra Kloudová | Surafel Lakew | Xutai Ma | Prashant Mathur | Paul McNamee | Kenton Murray | Maria Nǎdejde | Satoshi Nakamura | Matteo Negri | Jan Niehues | Xing Niu | John Ortega | Juan Pino | Elizabeth Salesky | Jiatong Shi | Matthias Sperber | Sebastian Stüker | Katsuhito Sudoh | Marco Turchi | Yogesh Virkar | Alexander Waibel | Changhan Wang | Shinji Watanabe
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation. A total of 27 teams participated in at least one of the shared tasks. This paper details, for each shared task, the purpose of the task, the data that were released, the evaluation metrics that were applied, the submissions that were received and the results that were achieved.

ON-TRAC Consortium Systems for the IWSLT 2022 Dialect and Low-resource Speech Translation Tasks
Marcely Zanon Boito | John Ortega | Hugo Riguidel | Antoine Laurent | Loïc Barrault | Fethi Bougares | Firas Chaabani | Ha Nguyen | Florentin Barbier | Souhir Gahbiche | Yannick Estève
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)

This paper describes the ON-TRAC Consortium translation systems developed for two challenge tracks featured in the Evaluation Campaign of IWSLT 2022: low-resource and dialect speech translation. For the Tunisian Arabic-English dataset (low-resource and dialect tracks), we build an end-to-end model as our joint primary submission, and compare it against cascaded models that leverage a large fine-tuned wav2vec 2.0 model for ASR. Our results show that in our settings pipeline approaches are still very competitive, and that with the use of transfer learning, they can outperform end-to-end models for speech translation (ST). For the Tamasheq-French dataset (low-resource track) our primary submission leverages intermediate representations from a wav2vec 2.0 model trained on 234 hours of Tamasheq audio, while our contrastive model uses a French phonetic transcription of the Tamasheq audio as input in a Conformer speech translation architecture jointly trained on automatic speech recognition, ST and machine translation losses. Our results highlight that self-supervised models trained on smaller sets of target data are more effective for low-resource end-to-end ST fine-tuning than large off-the-shelf models. Results also illustrate that even approximate phonetic transcriptions can improve ST scores.

Controlling Formality in Low-Resource NMT with Domain Adaptation and Re-Ranking: SLT-CDT-UoS at IWSLT2022
Sebastian Vincent | Loïc Barrault | Carolina Scarton
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)

This paper describes the SLT-CDT-UoS group’s submission to the first Special Task on Formality Control for Spoken Language Translation, part of the IWSLT 2022 Evaluation Campaign. Our efforts were split between two fronts: data engineering and altering the objective function for best hypothesis selection. We used language-independent methods to extract formal and informal sentence pairs from the provided corpora; using English as a pivot language, we propagated formality annotations to languages treated as zero-shot in the task; we also further improved formality controlling with a hypothesis re-ranking approach. On the test sets for English-to-German and English-to-Spanish, we achieved an average accuracy of .935 within the constrained setting and .995 within unconstrained setting. In a zero-shot setting for English-to-Russian and English-to-Italian, we scored average accuracy of .590 for constrained setting and .659 for unconstrained.

Speech Resources in the Tamasheq Language
Marcely Zanon Boito | Fethi Bougares | Florentin Barbier | Souhir Gahbiche | Loïc Barrault | Mickael Rouvier | Yannick Estève
Proceedings of the Thirteenth Language Resources and Evaluation Conference

In this paper we present two datasets for Tamasheq, a developing language mainly spoken in Mali and Niger. These two datasets were made available for the IWSLT 2022 low-resource speech translation track, and they consist of collections of radio recordings from daily broadcast news in Niger (Studio Kalangou) and Mali (Studio Tamani). We share (i) a massive amount of unlabeled audio data (671 hours) in five languages: French from Niger, Fulfulde, Hausa, Tamasheq and Zarma, and (ii) a smaller 17-hour parallel corpus of audio recordings in Tamasheq, with utterance-level translations in the French language. All this data is shared under the Creative Commons BY-NC-ND 3.0 license. We hope these resources will inspire the speech community to develop and benchmark models using the Tamasheq language.

2021

In Factuality: Efficient Integration of Relevant Facts for Visual Question Answering
Peter Vickers | Nikolaos Aletras | Emilio Monti | Loïc Barrault
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Visual Question Answering (VQA) methods aim at leveraging visual input to answer questions that may require complex reasoning over entities. Current models are trained on labelled data that may be insufficient to learn complex knowledge representations. In this paper, we propose a new method to enhance the reasoning capabilities of a multi-modal pretrained model (Vision+Language BERT) by integrating facts extracted from an external knowledge base. Evaluation on the KVQA dataset benchmark demonstrates that our method outperforms competitive baselines by 19%, achieving new state-of-the-art results. We also perform an extensive analysis highlighting the limitations of our best performing model through an ablation study.

Active Learning by Acquiring Contrastive Examples
Katerina Margatina | Giorgos Vernikos | Loïc Barrault | Nikolaos Aletras
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Common acquisition functions for active learning use either uncertainty or diversity sampling, aiming to select difficult and diverse data points from the pool of unlabeled data, respectively. In this work, leveraging the best of both worlds, we propose an acquisition function that opts for selecting contrastive examples, i.e. data points that are similar in the model feature space and yet the model outputs maximally different predictive likelihoods. We compare our approach, CAL (Contrastive Active Learning), with a diverse set of acquisition functions in four natural language understanding tasks and seven datasets. Our experiments show that CAL performs consistently better than or equal to the best performing baseline across all tasks, on both in-domain and out-of-domain data. We also conduct an extensive ablation study of our method and we further analyze all actively acquired datasets showing that CAL achieves a better trade-off between uncertainty and diversity compared to other strategies.
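
The notion of a contrastive example described above (close in feature space, far apart in predicted likelihoods) can be sketched as a scoring function over the unlabeled pool. This is a minimal illustration rather than the paper's implementation: the abstract does not spell out the procedure, so the use of Euclidean nearest neighbours among labeled points, KL divergence as the disagreement measure, and all function and parameter names here are assumptions made for the example.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_scores(pool_feats, pool_probs, lab_feats, lab_probs, k=2):
    """Score each pool candidate by its mean divergence from its k nearest labeled neighbours."""
    scores = []
    for feat, prob in zip(pool_feats, pool_probs):
        # Neighbours are chosen by feature-space distance only (the "similar" part).
        nearest = sorted(
            range(len(lab_feats)), key=lambda i: euclidean(feat, lab_feats[i])
        )[:k]
        # A high divergence means the model predicts differently for similar inputs.
        score = sum(kl_divergence(lab_probs[i], prob) for i in nearest) / len(nearest)
        scores.append(score)
    return scores


def select_contrastive(pool_feats, pool_probs, lab_feats, lab_probs, k=2, budget=1):
    """Return indices of the `budget` pool candidates with the highest contrastive score."""
    scores = contrastive_scores(pool_feats, pool_probs, lab_feats, lab_probs, k)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]
```

A candidate that sits next to confidently labeled points but receives the opposite prediction gets the highest score and would be sent for annotation first.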

2020

Simultaneous Machine Translation with Visual Context
Ozan Caglayan | Julia Ive | Veneta Haralampieva | Pranava Madhyastha | Loïc Barrault | Lucia Specia
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Simultaneous machine translation (SiMT) aims to translate a continuous input text stream into another language with the lowest latency and highest quality possible. The translation thus has to start with an incomplete source text, which is read progressively, creating the need for anticipation. In this paper, we seek to understand whether the addition of visual information can compensate for the missing source context. To this end, we analyse the impact of different multimodal approaches and visual features on state-of-the-art SiMT frameworks. Our results show that visual context is helpful and that visually-grounded models based on explicit object region information are much better than commonly used global features, reaching up to 3 BLEU points improvement under low latency scenarios. Our qualitative analysis illustrates cases where only the multimodal systems are able to translate correctly from English into gender-marked languages, as well as deal with differences in word order, such as adjective-noun placement between English and French.

Évaluation de systèmes apprenant tout au long de la vie (Evaluation of lifelong learning systems)
Yevhenii Prokopalo | Sylvain Meignier | Olivier Galibert | Loïc Barrault | Anthony Larcher
Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 1 : Journées d'Études sur la Parole

Today, intelligent systems achieve excellent performance in many domains when they are trained by machine learning experts. When these systems are put into production, their performance degrades over time as their real-world environment evolves. Having machine learning experts adapt the models is possible but very costly, whereas the companies using these systems have domain experts who could support the systems in lifelong learning. In this article, we propose a generic evaluation framework for lifelong learning systems (SATLV, from the French "systèmes apprenant tout au long de la vie"). We propose to evaluate human-assisted learning (active or interactive) and learning over time.

Traduction automatique pour la normalisation du français du XVIIe siècle (Machine translation for the normalisation of 17th-century French)
Simon Gabay | Loïc Barrault
Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 2 : Traitement Automatique des Langues Naturelles

Evaluation of Lifelong Learning Systems
Yevhenii Prokopalo | Sylvain Meignier | Olivier Galibert | Loic Barrault | Anthony Larcher
Proceedings of the Twelfth Language Resources and Evaluation Conference

Current intelligent systems need the expensive support of machine learning experts to sustain their performance level when used on a daily basis. To reduce this cost, i.e. remaining free from any machine learning expert, it is reasonable to implement lifelong (or continuous) learning intelligent systems that will continuously adapt their model when facing changing execution conditions. In this work, the systems are allowed to refer to human domain experts who can provide the system with relevant knowledge about the task. Nowadays, the fast growth of lifelong learning systems development raises the question of their evaluation. In this article we propose a generic evaluation methodology for the specific case of lifelong learning systems. Two steps will be considered. First, the evaluation of human-assisted learning (including active and/or interactive learning) outside the context of lifelong learning. Second, the system evaluation across time, with propositions of how a lifelong learning intelligent system should be evaluated when including human-assisted learning or not.

This paper presents the results of the news translation task and the similar language translation task, both organised alongside the Conference on Machine Translation (WMT) 2020. In the news task, participants were asked to build machine translation systems for any of 11 language pairs, to be evaluated on test sets consisting mainly of news stories. The task was also opened up to additional test suites to probe specific aspects of translation. In the similar language translation task, participants built machine translation systems for translating between closely related pairs of languages.

Findings of the First Shared Task on Lifelong Learning Machine Translation
Loïc Barrault | Magdalena Biesialska | Marta R. Costa-jussà | Fethi Bougares | Olivier Galibert
Proceedings of the Fifth Conference on Machine Translation

A lifelong learning system can adapt to new data without forgetting previously acquired knowledge. In this paper, we introduce the first benchmark for lifelong learning machine translation. For this purpose, we provide training, lifelong and test data sets for two language pairs: English-German and English-French. Additionally, we report the results of our baseline systems, which we make available to the public. The goal of this shared task is to encourage research on the emerging topic of lifelong learning machine translation.

2019

The IWSLT 2019 Evaluation Campaign
Jan Niehues | Rolando Cattoni | Sebastian Stüker | Matteo Negri | Marco Turchi | Thanh-Le Ha | Elizabeth Salesky | Ramon Sanabria | Loic Barrault | Lucia Specia | Marcello Federico
Proceedings of the 16th International Conference on Spoken Language Translation

The IWSLT 2019 evaluation campaign featured three tasks: speech translation of (i) TED talks and (ii) How2 instructional videos from English into German and Portuguese, and (iii) text translation of TED talks from English into Czech. For the first two tasks we encouraged submissions of end-to-end speech-to-text systems, and for the second task participants could also use the video as additional input. We received submissions by 12 research teams. This overview provides detailed descriptions of the data and evaluation conditions of each task and reports results of the participating systems.

Étude de l’apprentissage par transfert de systèmes de traduction automatique neuronaux (Study on transfer learning in neural machine translation)
Adrien Bardet | Fethi Bougares | Loïc Barrault
Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Volume II : Articles courts

Transfer learning is one solution to the problem of training neural machine translation systems for low-resource language pairs. In this article, we analyse this method. We aim to evaluate the impact of the amount of data and of the proximity of the languages involved in order to obtain the best possible transfer. We take these two parameters into account not only for a “classic” translation task but also when data corpora are lacking. Finally, we propose an approach in which data volume and language proximity are combined, so that one no longer has to choose between the two.

Probing the Need for Visual Context in Multimodal Machine Translation
Ozan Caglayan | Pranava Madhyastha | Lucia Specia | Loïc Barrault
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Current work on multimodal machine translation (MMT) has suggested that the visual modality is either unnecessary or only marginally beneficial. We posit that this is a consequence of the very simple, short and repetitive sentences used in the only available dataset for the task (Multi30K), rendering the source text sufficient as context. In the general case, however, we believe that it is possible to combine visual and textual information in order to ground translations. In this paper we probe the contribution of the visual modality to state-of-the-art MMT models by conducting a systematic analysis where we partially deprive the models from source-side textual context. Our results show that under limited textual context, models are capable of leveraging the visual input to generate better translations. This contradicts the current belief that MMT models disregard the visual modality because of either the quality of the image features or the way they are integrated into the model.

This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2019. Participants were asked to build machine translation systems for any of 18 language pairs, to be evaluated on a test set of news stories. The main metric for this task is human judgment of translation quality. The task was also opened up to additional test suites to probe specific aspects of translation.

LIUM’s Contributions to the WMT2019 News Translation Task: Data and Systems for German-French Language Pairs
Fethi Bougares | Jane Wottawa | Anne Baillot | Loïc Barrault | Adrien Bardet
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

This paper describes the neural machine translation (NMT) systems of the LIUM Laboratory developed for the French↔German news translation task of the Fourth Conference on Machine Translation (WMT 2019). The chosen language pair is included for the first time in the WMT news translation task. We describe how the training and the evaluation data was created. We also present our participation in the French↔German translation directions using self-attentional Transformer networks with small and big architectures.

2018

What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties
Alexis Conneau | German Kruszewski | Guillaume Lample | Loïc Barrault | Marco Baroni
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Although much effort has recently been devoted to training high-quality sentence embeddings, we still have a poor understanding of what they are capturing. “Downstream” tasks, often based on sentence classification, are commonly used to evaluate the quality of sentence representations. The complexity of the tasks makes it however difficult to infer what kind of information is present in the representations. We introduce here 10 probing tasks designed to capture simple linguistic features of sentences, and we use them to study embeddings generated by three different encoders trained in eight distinct ways, uncovering intriguing properties of both encoders and training methods.

Findings of the Third Shared Task on Multimodal Machine Translation
Loïc Barrault | Fethi Bougares | Lucia Specia | Chiraag Lala | Desmond Elliott | Stella Frank
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

We present the results from the third shared task on multimodal machine translation. In this task a source sentence in English is supplemented by an image and participating systems are required to generate a translation for such a sentence into German, French or Czech. The image can be used in addition to (or instead of) the source sentence. This year the task was extended with a third target language (Czech) and a new test set. In addition, a variant of this task was introduced with its own test set where the source sentence is given in multiple languages: English, French and German, and participating systems are required to generate a translation in Czech. Seven teams submitted 45 different systems to the two variants of the task. Compared to last year, the performance of the multimodal submissions improved, but text-only systems remain competitive.

LIUM-CVC Submissions for WMT18 Multimodal Translation Task
Ozan Caglayan | Adrien Bardet | Fethi Bougares | Loïc Barrault | Kai Wang | Marc Masana | Luis Herranz | Joost van de Weijer
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

This paper describes the multimodal Neural Machine Translation systems developed by LIUM and CVC for WMT18 Shared Task on Multimodal Translation. This year we propose several modifications to our previous multimodal attention architecture in order to better integrate convolutional features and refine them using encoder-side information. Our final constrained submissions ranked first for English→French and second for English→German language pairs among the constrained submissions according to the automatic evaluation metric METEOR.

2017

Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
Alexis Conneau | Douwe Kiela | Holger Schwenk | Loïc Barrault | Antoine Bordes
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Many modern NLP systems rely on word embeddings, previously trained in an unsupervised manner on large corpora, as base features. Efforts to obtain embeddings for larger chunks of text, such as sentences, have however not been so successful. Several attempts at learning unsupervised representations of sentences have not reached satisfactory enough performance to be widely adopted. In this paper, we show how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks. Much like how computer vision uses ImageNet to obtain features, which can then be transferred to other tasks, our work tends to indicate the suitability of natural language inference for transfer learning to other NLP tasks. Our encoder is publicly available.

Very Deep Convolutional Networks for Text Classification
Alexis Conneau | Holger Schwenk | Loïc Barrault | Yann Lecun
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

The dominant approaches for many NLP tasks are recurrent neural networks, in particular LSTMs, and convolutional neural networks. However, these architectures are rather shallow in comparison to the deep convolutional networks which have pushed the state-of-the-art in computer vision. We present a new architecture (VDCNN) for text processing which operates directly at the character level and uses only small convolutions and pooling operations. We are able to show that the performance of this model increases with the depth: using up to 29 convolutional layers, we report improvements over the state-of-the-art on several public text classification tasks. To the best of our knowledge, this is the first time that very deep convolutional nets have been applied to text processing.

Word Representations in Factored Neural Machine Translation
Franck Burlot | Mercedes García-Martínez | Loïc Barrault | Fethi Bougares | François Yvon
Proceedings of the Second Conference on Machine Translation

Findings of the Second Shared Task on Multimodal Machine Translation and Multilingual Image Description
Desmond Elliott | Stella Frank | Loïc Barrault | Fethi Bougares | Lucia Specia
Proceedings of the Second Conference on Machine Translation

LIUM Machine Translation Systems for WMT17 News Translation Task
Mercedes García-Martínez | Ozan Caglayan | Walid Aransa | Adrien Bardet | Fethi Bougares | Loïc Barrault
Proceedings of the Second Conference on Machine Translation

LIUM-CVC Submissions for WMT17 Multimodal Translation Task
Ozan Caglayan | Walid Aransa | Adrien Bardet | Mercedes García-Martínez | Fethi Bougares | Loïc Barrault | Marc Masana | Luis Herranz | Joost van de Weijer
Proceedings of the Second Conference on Machine Translation

2016

Factored Neural Machine Translation Architectures
Mercedes García-Martínez | Loïc Barrault | Fethi Bougares
Proceedings of the 13th International Conference on Spoken Language Translation

In this paper we investigate the potential of neural machine translation (NMT) when taking into consideration the linguistic aspects of the target language. From this standpoint, the NMT approach with attention mechanism [1] is extended in order to produce several linguistically derived outputs. We train our model to simultaneously output the lemma and its corresponding factors (e.g. part-of-speech, gender, number). The word-level translation is built with a mapping function using a priori linguistic information. Compared to the standard NMT system, the factored architecture significantly increases the vocabulary coverage while decreasing the number of unknown words. With its richer architecture, the Factored NMT approach allows us to implement several training setups that are discussed in detail throughout this paper. On the IWSLT’15 English-to-French task, the FNMT model outperforms the NMT model in terms of BLEU score. A qualitative analysis of the output on a set of test sentences shows the effectiveness of the FNMT model.

Does Multimodality Help Human and Machine for Translation and Image Captioning?
Ozan Caglayan | Walid Aransa | Yaxing Wang | Marc Masana | Mercedes García-Martínez | Fethi Bougares | Loïc Barrault | Joost van de Weijer
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

SHEF-LIUM-NN: Sentence level Quality Estimation with Neural Network Features
Kashif Shah | Fethi Bougares | Loïc Barrault | Lucia Specia
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

2015

The LIUM ASR and SLT systems for IWSLT 2015
Mercedes Garcia Martínez | Loïc Barrault | Anthony Rousseau | Paul Deléglise | Yannick Estève
Proceedings of the 12th International Workshop on Spoken Language Translation: Evaluation Campaign

Improving continuous space language models auxiliary features
Walid Aransa | Holger Schwenk | Loïc Barrault
Proceedings of the 12th International Workshop on Spoken Language Translation: Papers

Continuous Adaptation to User Feedback for Statistical Machine Translation
Frédéric Blain | Fethi Bougares | Amir Hazem | Loïc Barrault | Holger Schwenk
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Incremental Adaptation Strategies for Neural Network Language Models
Alex Ter-Sarkisov | Holger Schwenk | Fethi Bougares | Loïc Barrault
Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality

2014

LIUM English-to-French spoken language translation system and the Vecsys/LIUM automatic speech recognition system for Italian language for IWSLT 2014
Anthony Rousseau | Loïc Barrault | Paul Deléglise | Yannick Estève | Holger Schwenk | Samir Bennacef | Armando Muscariello | Stephan Vanni
Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes the Spoken Language Translation system developed by the LIUM for the IWSLT 2014 evaluation campaign. We participated in two of the proposed tasks: (i) the Automatic Speech Recognition task (ASR) in two languages, Italian with the Vecsys company, and English alone, (ii) the English to French Spoken Language Translation task (SLT). We present the approaches and specificities found in our systems, as well as the results from the evaluation campaign.

Using Hypothesis Selection Based Features for Confusion Network MT System Combination
Sahar Ghannay | Loïc Barrault
Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra)

2013

Issues in incremental adaptation of statistical MT from human post-edits
Mauro Cettolo | Christophe Servan | Nicola Bertoldi | Marcello Federico | Loïc Barrault | Holger Schwenk
Proceedings of the 2nd Workshop on Post-editing Technology and Practice

Proceedings of RECITAL 2013
Florian Boudin | Loïc Barrault
Proceedings of RECITAL 2013

Multimodal Comparable Corpora as Resources for Extracting Parallel Data: Parallel Phrases Extraction
Haithem Afli | Loïc Barrault | Holger Schwenk
Proceedings of the Sixth International Joint Conference on Natural Language Processing

2012

A General Framework to Weight Heterogeneous Parallel Data for Model Adaptation in Statistical MT
Kashif Shah | Loïc Barrault | Holger Schwenk
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers

The standard procedure to train the translation model of a phrase-based SMT system is to concatenate all available parallel data, to perform word alignment, to extract phrase pairs and to calculate translation probabilities by simple relative frequency. However, parallel data is quite inhomogeneous in many practical applications with respect to several factors like data source, alignment quality, appropriateness to the task, etc. We propose a general framework to take into account these factors during the calculation of the phrase-table, e.g. by better distributing the probability mass of the individual phrase pairs. No additional feature functions are needed. We report results on two well-known tasks: the IWSLT’11 and WMT’11 evaluations, in both conditions translating from English to French. We give detailed results for different functions to weight the bitexts. Our best systems improve a strong baseline by up to one BLEU point without any impact on the computational complexity during training or decoding.
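The contrast the abstract draws, plain relative frequency versus a phrase table that accounts for where each phrase pair came from, can be illustrated with a minimal sketch. It assumes one scalar weight per sub-corpus applied to the phrase-pair counts before normalisation; the function name, the corpus names, and the weight values are hypothetical, and this is not claimed to be the weighting scheme used in the paper.

```python
from collections import defaultdict


def weighted_phrase_table(corpora, weights):
    """Estimate p(target | source) by weighted relative frequency.

    corpora: dict mapping a corpus name to a list of extracted
             (source_phrase, target_phrase) pairs.
    weights: dict mapping the same corpus names to a non-negative weight
             (hypothetical values; uniform weights of 1.0 recover the
             simple relative-frequency estimate).
    """
    pair_counts = defaultdict(float)
    source_counts = defaultdict(float)
    for name, pairs in corpora.items():
        w = weights[name]
        for src, tgt in pairs:
            # Each occurrence contributes its corpus weight instead of 1.
            pair_counts[(src, tgt)] += w
            source_counts[src] += w
    return {
        (src, tgt): count / source_counts[src]
        for (src, tgt), count in pair_counts.items()
    }


# Toy data: an in-domain corpus and a larger, noisier out-of-domain one.
corpora = {
    "in_domain": [("the house", "la maison")] * 3 + [("the house", "le domicile")],
    "out_of_domain": [("the house", "le domicile")] * 4,
}

uniform = weighted_phrase_table(corpora, {"in_domain": 1.0, "out_of_domain": 1.0})
adapted = weighted_phrase_table(corpora, {"in_domain": 1.0, "out_of_domain": 0.25})
```

With uniform weights the out-of-domain corpus dominates the estimate for "the house"; down-weighting it shifts probability mass toward the in-domain translation without adding any feature function to the model.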

Semi-supervised transliteration mining from parallel and comparable corpora
Walid Aransa | Holger Schwenk | Loic Barrault
Proceedings of the 9th International Workshop on Spoken Language Translation: Papers

Transliteration is the process of writing a word (mainly a proper noun) from one language in the alphabet of another language. This process requires mapping the pronunciation of the word from the source language to the closest possible pronunciation in the target language. In this paper we introduce a new semi-supervised transliteration mining method for parallel and comparable corpora. The method is mainly based on newly proposed Three Levels of Similarity (TLS) scores to extract the transliteration pairs. The first level calculates the similarity of all vowel letters and consonant letters. The second level calculates the similarity of long vowels, vowel letters at the beginning and end positions of the words, and consonant letters. The third level calculates the similarity of consonant letters only. We applied our method to Arabic-English parallel and comparable corpora. We evaluated the extracted transliteration pairs using a statistical transliteration system. This system is built using letters instead of words as tokens. The transliteration system achieves an accuracy of 0.50 and a mean F-score of 0.8958 when trained on transliteration pairs extracted from a parallel corpus. The accuracy is 0.30 and the mean F-score 0.84 when a comparable corpus is used instead to automatically extract the transliteration pairs. This shows that the proposed semi-supervised transliteration mining algorithm is effective and can be applied to other language pairs. We also evaluated two segmentation techniques and reported their impact on the transliteration performance.

Traduction automatique à partir de corpus comparables: extraction de phrases parallèles à partir de données comparables multimodales (Automatic Translation from Comparable corpora : extracting parallel sentences from multimodal comparable corpora) [in French]
Haithem Afli | Loïc Barrault | Holger Schwenk
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 2: TALN

LIUM’s SMT Machine Translation Systems for WMT 2012
Christophe Servan | Patrik Lambert | Anthony Rousseau | Holger Schwenk | Loïc Barrault
Proceedings of the Seventh Workshop on Statistical Machine Translation

2011

Parametric Weighting of Parallel Data for Statistical Machine Translation
Kashif Shah | Loïc Barrault | Holger Schwenk
Proceedings of 5th International Joint Conference on Natural Language Processing

MANY improvements for WMT’11
Loïc Barrault
Proceedings of the Sixth Workshop on Statistical Machine Translation

LIUM’s SMT Machine Translation Systems for WMT 2011
Holger Schwenk | Patrik Lambert | Loïc Barrault | Christophe Servan | Sadaf Abdul-Rauf | Haithem Afli | Kashif Shah
Proceedings of the Sixth Workshop on Statistical Machine Translation

2010

LIUM’s statistical machine translation system for IWSLT 2010
Anthony Rousseau | Loïc Barrault | Paul Deléglise | Yannick Estève
Proceedings of the 7th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes the two systems developed by the LIUM laboratory for the 2010 IWSLT evaluation campaign. We participated in the new English to French TALK task. We developed two systems, one for each evaluation condition, both being statistical phrase-based systems using the Moses toolkit. Several approaches were investigated.

MANY: Open Source MT System Combination at WMT’10
Loïc Barrault
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

Translation Model Adaptation by Resampling
Kashif Shah | Loïc Barrault | Holger Schwenk
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

2009

LIUM’s statistical machine translation system for IWSLT 2009
Holger Schwenk | Loïc Barrault | Yannick Estève | Patrik Lambert
Proceedings of the 6th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes the systems developed by the LIUM laboratory for the 2009 IWSLT evaluation. We participated in the Arabic and Chinese to English BTEC tasks. We developed three different systems: a statistical phrase-based system using the Moses toolkit, a Statistical Post-Editing system and a hierarchical phrase-based system based on Joshua. A continuous space language model was deployed to improve the modeling of the target language. These systems are combined by a confusion network based approach.

SMT and SPE Machine Translation Systems for WMT’09
Holger Schwenk | Sadaf Abdul-Rauf | Loïc Barrault | Jean Senellart
Proceedings of the Fourth Workshop on Statistical Machine Translation

Co-authors

Ondřej Bojar 6

Ozan Caglayan 6

Christian Federmann 6

Mercedes García-Martínez 6

Adrien Bardet 5

Yvette Graham 5

Matthias Huck 5

Christof Monz 5

Nikolaos Aletras 4

Marcello Federico 4

Roman Grundkiewicz 4

Makoto Morishita 4

Anthony Rousseau 4

Rajen Chatterjee 3

Alexis Conneau 3

Paul Deléglise 3

Alexander Fraser 3

Souhir Gahbiche 3

Olivier Galibert 3

Antonio Jimeno Yepes 3

Patrik Lambert 3

André F. T. Martins 3

Masaaki Nagata 3

Toshiaki Nakazawa 3

Carolina Scarton 3

Christophe Servan 3

Marcos Zampieri 3

Marcely Zanon Boito 3

Joost van de Weijer 3

Sadaf Abdul-Rauf 2

Florentin Barbier 2

Nicola Bertoldi 2

Magdalena Biesialska 2

Frédéric Blain 2

Mauro Cettolo 2

Desmond Elliott 2

Markus Freitag 2

Patrick Gallinari 2

Prangthip Hansanti 2

João Maria Janeiro 2

Anthony Larcher 2

Pranava Swaroop Madhyastha 2

Katerina Margatina 2

Sylvain Meignier 2

Kenton Murray 2

Benjamin Piwowarski 2

Yevhenii Prokopalo 2

Christophe Ropers 2

Elizabeth Salesky 2

Sebastian Stüker 2

Peter Vickers 2

Belen Alastruey 1

Antonios Anastasopoulos 1

Bismarck Bamfo Odoom 1

Samir Bennacef 1

Luisa Bentivogli 1

Antoine Bordes 1

Florian Boudin 1

Christian Buck 1

Franck Burlot 1

Alessandro Cattelan 1

Rolando Cattoni 1

Roldano Cattoni 1

Firas Chaabani 1

Christophe Declercq 1

Georgiana Dinu 1

Clara Emmanuel 1

Antonio Farina 1

Margot Fonteyne 1

Mikel L. Forcada 1

Paola Leibny Garcia 1

Ulrich Germann 1

Sahar Ghannay 1

Veneta Haralampieva 1

Kevin Heffernan 1

Dávid Javorský 1

Elahe Kalbassi 1

Věra Kloudová 1

Maarit Koponen 1

Artyom Kozhevnikov 1

Germán Kruszewski 1

Surafel Lakew 1

Guillaume Lample 1

Antoine Laurent 1

Nikola Ljubešić 1

Domenico Lupinetti 1

Shervin Malmasi 1

Andrea Martines 1

Francisco Massa 1

Alberto Massidda 1

Prashant Mathur 1

Alexandre Mourachko 1

Alex Mourachko 1

Armando Muscariello 1

Mathias Müller 1

Maria Nadejde 1

Satoshi Nakamura 1

Aurelie Neveol 1

Mariana Neves 1

Spyridon Pilos 1

Hugo Riguidel 1

Mickael Rouvier 1

Andrew Rufener 1

Ramon Sanabria 1

Jean Senellart 1

Matthias Sperber 1

Katsuhito Sudoh 1

Alex Ter-Sarkisov 1

Marco Trombetti 1

Joachim Van Den Bogaert 1

Stephan Vanni 1

Giorgos Vernikos 1

Sebastian T. Vincent 1

Sebastian Vincent 1

Yogesh Virkar 1

Changhan Wang 1

Shinji Watanabe 1

Matthew Wiesner 1

François Yvon 1

Venues