2024
xTower: A Multilingual LLM for Explaining and Correcting Translation Errors
Marcos V Treviso | Nuno M Guerreiro | Sweta Agrawal | Ricardo Rei | José Pombal | Tania Vaz | Helena Wu | Beatriz Silva | Daan Van Stigt | Andre Martins
Findings of the Association for Computational Linguistics: EMNLP 2024
While machine translation (MT) systems are achieving increasingly strong performance on benchmarks, they often produce translations with errors and anomalies. Understanding these errors can potentially help improve the translation quality and user experience. This paper introduces xTower, an open large language model (LLM) built on top of TowerBase designed to provide free-text explanations for translation errors in order to guide the generation of a corrected translation. The quality of the explanations generated by xTower is assessed via both intrinsic and extrinsic evaluation. We ask expert translators to evaluate the quality of the explanations across two dimensions: relatedness towards the error span being explained and helpfulness in error understanding and improving translation quality. Extrinsically, we test xTower across various experimental setups in generating translation corrections, demonstrating significant improvements in translation quality. Our findings highlight xTower’s potential towards not only producing plausible and helpful explanations of automatic translations, but also leveraging them to suggest corrected translations.
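As a concrete illustration of the setup described in the abstract, here is a minimal sketch of how one might prompt an xTower-style model with a source sentence, its machine translation, and annotated error spans to obtain explanations and a corrected translation. The prompt wording and the model identifier are assumptions for illustration, not the paper's exact template.

```python
# Hypothetical prompt construction for an xTower-style explain-and-correct query.
# The model name and instruction wording are assumptions, not the paper's template.
from transformers import pipeline

def build_prompt(source: str, translation: str, error_spans: list[tuple[str, str]]) -> str:
    """List each annotated error span, then ask for explanations and a correction."""
    errors = "\n".join(f'- "{span}" (severity: {severity})' for span, severity in error_spans)
    return (
        "You are given a source sentence, its machine translation, and annotated error spans.\n"
        f"Source: {source}\n"
        f"Translation: {translation}\n"
        f"Errors:\n{errors}\n"
        "Explain each error, then provide a corrected translation."
    )

generator = pipeline("text-generation", model="sardinelab/xTower13B")  # model id assumed
prompt = build_prompt(
    source="Der Vertrag wurde gestern unterzeichnet.",
    translation="The contract was signed tomorrow.",
    error_spans=[("tomorrow", "major")],
)
print(generator(prompt, max_new_tokens=256)[0]["generated_text"])
```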
Tower v2: Unbabel-IST 2024 Submission for the General MT Shared Task
Ricardo Rei | Jose Pombal | Nuno M. Guerreiro | João Alves | Pedro Henrique Martins | Patrick Fernandes | Helena Wu | Tania Vaz | Duarte Alves | Amin Farajian | Sweta Agrawal | Antonio Farinhas | José G. C. De Souza | André Martins
Proceedings of the Ninth Conference on Machine Translation
In this work, we present Tower v2, an improved iteration of the state-of-the-art open-weight Tower models, and the backbone of our submission to the WMT24 General Translation shared task. Tower v2 introduces key improvements including expanded language coverage, enhanced data quality, and increased model capacity up to 70B parameters. Our final submission combines these advancements with quality-aware decoding strategies, selecting translations based on multiple translation quality signals. The resulting system demonstrates significant improvement over previous versions, outperforming closed commercial systems like GPT-4o, Claude 3.5, and DeepL even at a smaller 7B scale.
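As a rough illustration of the quality-aware decoding step mentioned above, the sketch below reranks a pool of candidate translations with a reference-free COMET quality-estimation model and keeps the highest-scoring one. The specific checkpoint and settings are assumptions, not the exact recipe of the Tower v2 submission.

```python
# Minimal quality-aware decoding sketch: rerank candidate translations with a
# reference-free QE metric and keep the best one. Checkpoint choice is assumed.
from comet import download_model, load_from_checkpoint

def pick_best(source: str, candidates: list[str], qe_model) -> str:
    """Score each candidate against the source and return the top-ranked one."""
    data = [{"src": source, "mt": mt} for mt in candidates]
    scores = qe_model.predict(data, batch_size=8, gpus=0).scores
    return max(zip(scores, candidates))[1]

if __name__ == "__main__":
    qe = load_from_checkpoint(download_model("Unbabel/wmt22-cometkiwi-da"))
    print(pick_best(
        "A manhã está fria.",
        ["The morning is cold.", "Morning is a cold.", "The tomorrow is cold."],
        qe,
    ))
```

In the actual system, the candidates would be generated by the Tower v2 model itself, and several quality signals can be combined rather than relying on a single metric.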
Improving Context Usage for Translating Bilingual Customer Support Chat with Large Language Models
Jose Pombal | Sweta Agrawal | André Martins
Proceedings of the Ninth Conference on Machine Translation
This paper describes Unbabel+IT’s submission to the Chat Shared Task held at the Workshop on Machine Translation 2024. The task focuses on translating customer support chats between agents and customers communicating in different languages. We present two strategies for adapting state-of-the-art language models to better utilize contextual information when translating such conversations. Our training strategy involves finetuning the model on chat datasets with context-augmented instructions, resulting in a specialized model, TOWERCHAT. For inference, we propose a novel quality-aware decoding approach that leverages a context-aware metric, CONTEXTCOMET, to select the optimal translation from a pool of candidates. We evaluate our proposed approach on the official shared task datasets for ten language pairs, showing that our submission consistently outperforms the baselines on all language pairs and competing systems on 8 out of 10 language pairs across multiple automated metrics. Remarkably, TOWERCHAT outperforms our contrastive submission based on the much larger TOWER-V2-70B model while being 10× smaller. According to human evaluation, our system outperforms all other systems and baselines across all language pairs. These results underscore the importance of context-aware training and inference in handling complex bilingual dialogues.
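The following is a hypothetical illustration of what a context-augmented instruction for bilingual chat translation might look like: previous turns are prepended so the model can resolve pronouns, register, and terminology. The template is an assumption for illustration and not the exact TOWERCHAT instruction format.

```python
# Illustrative context-augmented instruction for chat translation.
# The template below is assumed, not the actual TOWERCHAT format.
def chat_translation_prompt(context_turns, current_turn, src_lang, tgt_lang):
    """Render the conversation history plus the turn to translate as one instruction."""
    history = "\n".join(f"{speaker}: {text}" for speaker, text in context_turns)
    return (
        f"Conversation so far:\n{history}\n\n"
        f"Translate the next {src_lang} message into {tgt_lang}, keeping the tone "
        f"consistent with the conversation:\n{current_turn}"
    )

print(chat_translation_prompt(
    context_turns=[
        ("Agent", "Hello! How can I help you today?"),
        ("Customer", "O meu pedido ainda não chegou."),
    ],
    current_turn="Pode verificar o estado da encomenda?",
    src_lang="Portuguese",
    tgt_lang="English",
))
```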
2023
Steering Large Language Models for Machine Translation with Finetuning and In-Context Learning
Duarte Alves | Nuno Guerreiro | João Alves | José Pombal | Ricardo Rei | José de Souza | Pierre Colombo | Andre Martins
Findings of the Association for Computational Linguistics: EMNLP 2023
Large language models (LLMs) are a promising avenue for machine translation (MT). However, current LLM-based MT systems are brittle: their effectiveness highly depends on the choice of few-shot examples and they often require extra post-processing due to overgeneration. Alternatives such as finetuning on translation instructions are computationally expensive and may weaken in-context learning capabilities, due to overspecialization. In this paper, we provide a closer look at this problem. We start by showing that adapter-based finetuning with LoRA matches the performance of traditional finetuning while reducing the number of training parameters by a factor of 50. This method also outperforms few-shot prompting and eliminates the need for post-processing or in-context examples. However, we show that finetuning generally degrades few-shot performance, hindering adaptation capabilities. Finally, to obtain the best of both worlds, we propose a simple approach that incorporates few-shot examples during finetuning. Experiments on 10 language pairs show that our proposed approach recovers the original few-shot capabilities while keeping the added benefits of finetuning.
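For readers unfamiliar with adapter-based finetuning, the sketch below shows a typical LoRA setup using the PEFT library: only small low-rank matrices attached to the attention projections are trained, which is where the reported factor-of-50 reduction in trainable parameters comes from. The base model, rank, and target modules are assumptions chosen for illustration; the paper's exact configuration may differ.

```python
# Minimal LoRA finetuning setup with PEFT; hyperparameters are illustrative only.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "meta-llama/Llama-2-7b-hf"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

config = LoraConfig(
    r=16,                                  # adapter rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # reports the small fraction of trainable weights
```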