Vassilina Nikoulina


2021

Do Multilingual Neural Machine Translation Models Contain Language Pair Specific Attention Heads?
Zae Myung Kim | Laurent Besacier | Vassilina Nikoulina | Didier Schwab
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

Naver Labs Europe’s Participation in the Robustness, Chat, and Biomedical Tasks at WMT 2020
Alexandre Bérard | Ioan Calapodescu | Vassilina Nikoulina | Jerin Philip
Proceedings of the Fifth Conference on Machine Translation

This paper describes Naver Labs Europe’s participation in the Robustness, Chat, and Biomedical Translation tasks at WMT 2020. We propose a bidirectional German-English model that is multi-domain, robust to noise, and which can translate entire documents (or bilingual dialogues) at once. We use the same ensemble of such models as our primary submission to all three tasks and achieve competitive results. We also experiment with language model pre-training techniques and evaluate their impact on robustness to noise and out-of-domain translation. For German, Spanish, Italian, and French to English translation in the Biomedical Task, we also submit our recently released multilingual Covid19NMT model.

A Multilingual Neural Machine Translation Model for Biomedical Data
Alexandre Bérard | Zae Myung Kim | Vassilina Nikoulina | Eunjeong Lucy Park | Matthias Gallé
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

We release a multilingual neural machine translation model, which can be used to translate text in the biomedical domain. The model can translate from 5 languages (French, German, Italian, Korean and Spanish) into English. It is trained with large amounts of generic and biomedical data, using domain tags. Our benchmarks show that it performs near state-of-the-art both on news (generic domain) and biomedical test sets, and that it outperforms the existing publicly released models. We believe that this release will help the large-scale multilingual analysis of the digital content of the COVID-19 crisis and of its effects on society, economy, and healthcare policies. We also release a test set of biomedical text for Korean-English. It consists of 758 sentences from official guidelines and recent papers, all about COVID-19.

2019

On the use of BERT for Neural Machine Translation
Stephane Clinchant | Kweon Woo Jung | Vassilina Nikoulina
Proceedings of the 3rd Workshop on Neural Generation and Translation

Exploiting large pretrained models for various NMT tasks has gained a lot of visibility recently. In this work, we study how BERT pretrained models could be exploited for supervised Neural Machine Translation. We compare various ways to integrate a pretrained BERT model with an NMT model and study the impact of the monolingual data used for BERT training on the final translation quality. We use the WMT-14 English-German, IWSLT15 English-German, and IWSLT14 English-Russian datasets for these experiments. In addition to the standard task test set evaluation, we perform evaluation on out-of-domain test sets and noise-injected test sets, in order to assess how BERT pretrained representations affect model robustness.

Machine Translation of Restaurant Reviews: New Corpus for Domain Adaptation and Robustness
Alexandre Bérard | Ioan Calapodescu | Marc Dymetman | Claude Roux | Jean-Luc Meunier | Vassilina Nikoulina
Proceedings of the 3rd Workshop on Neural Generation and Translation

We share a French-English parallel corpus of Foursquare restaurant reviews, and define a new task to encourage research on Neural Machine Translation robustness and domain adaptation, in a real-world scenario where better-quality MT would be greatly beneficial. We discuss the challenges of such user-generated content, and train good baseline models that build upon the latest techniques for MT robustness. We also perform an extensive evaluation (automatic and human) that shows significant improvements over existing online systems. Finally, we propose task-specific metrics based on sentiment analysis or translation accuracy of domain-specific polysemous words.

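“Sentiment Aware Map” : exploration cartographique de points d’intérêt via l’analyse de sentiments au niveau des aspects (“Sentiment Aware Map”: Cartographic Exploration of Points of Interest via Aspect-Level Sentiment Analysis)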
Ioan Calapodescu | Caroline Brun | Vassilina Nikoulina | Salah Aït-Mokhtar
Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Volume IV : Démonstrations

2018

Aspect Based Sentiment Analysis into the Wild
Caroline Brun | Vassilina Nikoulina
Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

In this paper, we test state-of-the-art Aspect Based Sentiment Analysis (ABSA) systems, trained on a widely used dataset, on real-world data. We create a new manually annotated dataset of user-generated data from the same domain as the training dataset but from other sources, and analyse the differences between the new and the standard ABSA datasets. We then analyse the performance of different versions of the same system on both datasets. We also propose light adaptation methods to increase system robustness.

2014

A Lightweight Terminology Verification Service for External Machine Translation Engines
Alessio Bosca | Vassilina Nikoulina | Marc Dymetman
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics

2012

Linguistically-Adapted Structural Query Annotation for Digital Libraries in the Social Sciences
Caroline Brun | Vassilina Nikoulina | Nikolaos Lagos
Proceedings of the 6th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities

Hybrid Adaptation of Named Entity Recognition for Statistical Machine Translation
Vassilina Nikoulina | Agnes Sandor | Marc Dymetman
Proceedings of the Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT

Adaptation of Statistical Machine Translation Model for Cross-Lingual Information Retrieval in a Service Context
Vassilina Nikoulina | Bogomil Kovachev | Nikolaos Lagos | Christof Monz
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

2008

Using Syntactic Coupling Features for Discriminating Phrase-Based Translations (WMT-08 Shared Translation Task)
Vassilina Nikoulina | Marc Dymetman
Proceedings of the Third Workshop on Statistical Machine Translation

Experiments in Discriminating Phrase-Based Translations on the Basis of Syntactic Coupling Features
Vassilina Nikoulina | Marc Dymetman
Proceedings of the ACL-08: HLT Second Workshop on Syntax and Structure in Statistical Translation (SSST-2)