2024
Tower v2: Unbabel-IST 2024 Submission for the General MT Shared Task
Ricardo Rei | Jose Pombal | Nuno M. Guerreiro | João Alves | Pedro Henrique Martins | Patrick Fernandes | Helena Wu | Tania Vaz | Duarte Alves | Amin Farajian | Sweta Agrawal | Antonio Farinhas | José G. C. De Souza | André Martins
Proceedings of the Ninth Conference on Machine Translation
In this work, we present Tower v2, an improved iteration of the state-of-the-art open-weight Tower models, and the backbone of our submission to the WMT24 General Translation shared task. Tower v2 introduces key improvements including expanded language coverage, enhanced data quality, and increased model capacity up to 70B parameters. Our final submission combines these advancements with quality-aware decoding strategies, selecting translations based on multiple translation quality signals. The resulting system demonstrates significant improvement over previous versions, outperforming closed commercial systems like GPT-4o, Claude 3.5, and DeepL even at a smaller 7B scale.
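The quality-aware decoding described in the abstract amounts to reranking candidate translations with automatic quality signals. Below is a minimal sketch of such a reranking step; the scorer names and weights are hypothetical placeholders standing in for quality-estimation models, not the exact signals used in the Tower v2 submission.

```python
from typing import Callable, Dict, List

def quality_aware_select(
    source: str,
    candidates: List[str],
    scorers: Dict[str, Callable[[str, str], float]],
    weights: Dict[str, float],
) -> str:
    """Pick the candidate with the highest weighted combination of
    quality signals (a generic reranking sketch, not the exact recipe
    of the Tower v2 submission)."""
    def combined(candidate: str) -> float:
        return sum(
            weights[name] * scorer(source, candidate)
            for name, scorer in scorers.items()
        )
    return max(candidates, key=combined)

# Hypothetical usage: the scorer callables stand in for reference-free
# quality-estimation models (e.g. a COMET-style QE system).
# best = quality_aware_select(src, hypotheses,
#                             scorers={"qe": my_qe_model.score},
#                             weights={"qe": 1.0})
```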
2023
Steering Large Language Models for Machine Translation with Finetuning and In-Context Learning
Duarte Alves | Nuno Guerreiro | João Alves | José Pombal | Ricardo Rei | José de Souza | Pierre Colombo | Andre Martins
Findings of the Association for Computational Linguistics: EMNLP 2023
Large language models (LLMs) are a promising avenue for machine translation (MT). However, current LLM-based MT systems are brittle: their effectiveness highly depends on the choice of few-shot examples and they often require extra post-processing due to overgeneration. Alternatives such as finetuning on translation instructions are computationally expensive and may weaken in-context learning capabilities, due to overspecialization. In this paper, we provide a closer look at this problem. We start by showing that adapter-based finetuning with LoRA matches the performance of traditional finetuning while reducing the number of training parameters by a factor of 50. This method also outperforms few-shot prompting and eliminates the need for post-processing or in-context examples. However, we show that finetuning generally degrades few-shot performance, hindering adaptation capabilities. Finally, to obtain the best of both worlds, we propose a simple approach that incorporates few-shot examples during finetuning. Experiments on 10 language pairs show that our proposed approach recovers the original few-shot capabilities while keeping the added benefits of finetuning.
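The adapter-based finetuning in this abstract can be sketched with the Hugging Face peft library. The base model name, LoRA rank, and target modules below are illustrative assumptions, not the paper's exact configuration.

```python
# A minimal sketch of LoRA adapter-based finetuning of an LLM for MT.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = "meta-llama/Llama-2-7b-hf"  # hypothetical base LLM, not the paper's
model = AutoModelForCausalLM.from_pretrained(base)

lora_config = LoraConfig(
    r=16,                                 # low-rank adapter dimension
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights trains

# To preserve in-context abilities, the paper mixes few-shot examples into
# the finetuning prompts; a training instance might look like this:
prompt = (
    "English: Hello world.\nPortuguese: Olá mundo.\n\n"  # in-context example
    "English: How are you?\nPortuguese:"                 # instance to translate
)
```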
Empirical Assessment of kNN-MT for Real-World Translation Scenarios
Pedro Henrique Martins | João Alves | Tânia Vaz | Madalena Gonçalves | Beatriz Silva | Marianna Buchicchio | José G. C. de Souza | André F. T. Martins
Proceedings of the 24th Annual Conference of the European Association for Machine Translation
This paper aims to investigate the effectiveness of the k-Nearest Neighbor Machine Translation model (kNN-MT) in real-world scenarios. kNN-MT is a retrieval-augmented framework that combines the advantages of parametric models with non-parametric datastores built using a set of parallel sentences. Previous studies have primarily focused on evaluating the model using only the BLEU metric and have not tested kNN-MT in real-world scenarios. Our study aims to fill this gap by conducting a comprehensive analysis on various datasets comprising different language pairs and different domains, using multiple automatic metrics and expert-evaluated Multidimensional Quality Metrics (MQM). We compare kNN-MT with two alternative strategies: fine-tuning all the model parameters and adapter-based fine-tuning. Finally, we analyze the effect of the datastore size on translation quality, and we examine the number of entries necessary to bootstrap and configure the index.
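At decoding time, kNN-MT interpolates the model's next-token distribution with a distribution built from the k nearest datastore entries. The sketch below shows that interpolation in the standard form; the temperature and interpolation weight are illustrative values, not those tuned in the paper.

```python
import numpy as np

def knn_mt_probs(
    p_model: np.ndarray,         # (vocab,) parametric model distribution
    neighbor_dists: np.ndarray,  # (k,) distances of retrieved datastore keys
    neighbor_tokens: np.ndarray, # (k,) target-token ids stored as values
    vocab_size: int,
    lam: float = 0.5,            # illustrative interpolation weight
    temperature: float = 10.0,   # illustrative softmax temperature
) -> np.ndarray:
    # Softmax over negative distances gives the retrieval distribution.
    scores = np.exp(-neighbor_dists / temperature)
    scores /= scores.sum()
    p_knn = np.zeros(vocab_size)
    np.add.at(p_knn, neighbor_tokens, scores)  # aggregate duplicate tokens
    # Interpolate retrieval and parametric distributions.
    return lam * p_knn + (1.0 - lam) * p_model
```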
2022
Unbabel-IST at the WMT Chat Translation Shared Task
João Alves | Pedro Henrique Martins | José G. C. de Souza | M. Amin Farajian | André F. T. Martins
Proceedings of the Seventh Conference on Machine Translation (WMT)
We present the joint contribution of IST and Unbabel to the WMT 2022 Chat Translation Shared Task. We participated in all six language directions (English ↔ German, English ↔ French, English ↔ Brazilian Portuguese). Due to the lack of domain-specific data, we use mBART50, a large pretrained language model trained on millions of sentence pairs, as our base model. We fine-tune it using a two-step process: in the first step, we fine-tune the model on publicly available data; in the second step, we fine-tune it on the task's validation set. Once we have a domain-specific model, we explore the use of kNN-MT as a way of incorporating domain-specific data at decoding time.
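The two-step fine-tuning recipe can be sketched as two successive training runs on the same mBART50 checkpoint. The dataset objects and training arguments below are placeholders, not the submission's actual hyperparameters.

```python
# Schematic two-step fine-tuning: first on public parallel data,
# then on the shared-task validation set.
from transformers import (
    MBartForConditionalGeneration,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model = MBartForConditionalGeneration.from_pretrained(
    "facebook/mbart-large-50-many-to-many-mmt"
)

def finetune(model, dataset, output_dir):
    # `dataset` is assumed to be a tokenized parallel dataset.
    args = Seq2SeqTrainingArguments(
        output_dir=output_dir,
        num_train_epochs=1,
        per_device_train_batch_size=8,
    )
    Seq2SeqTrainer(model=model, args=args, train_dataset=dataset).train()
    return model

# Step 1: publicly available data; Step 2: the task validation set.
# model = finetune(model, public_dataset, "step1")
# model = finetune(model, validation_dataset, "step2")
```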