Tânia Vaz
2023
Empirical Assessment of kNN-MT for Real-World Translation Scenarios
Pedro Henrique Martins | João Alves | Tânia Vaz | Madalena Gonçalves | Beatriz Silva | Marianna Buchicchio | José G. C. de Souza | André F. T. Martins
Proceedings of the 24th Annual Conference of the European Association for Machine Translation
This paper investigates the effectiveness of k-Nearest Neighbor Machine Translation (kNN-MT) in real-world scenarios. kNN-MT is a retrieval-augmented framework that combines the advantages of parametric models with non-parametric datastores built from a set of parallel sentences. Previous studies have primarily evaluated the model using only the BLEU metric and have not tested kNN-MT in real-world scenarios. Our study aims to fill this gap with a comprehensive analysis on various datasets covering different language pairs and domains, using multiple automatic metrics and expert-evaluated Multidimensional Quality Metrics (MQM). We compare kNN-MT with two alternative strategies: fine-tuning all the model parameters and adapter-based fine-tuning. Finally, we analyze the effect of datastore size on translation quality, and we examine the number of entries necessary to bootstrap and configure the index.
Co-authors
- Pedro Henrique Martins 1
- João Alves 1
- Madalena Gonçalves 1
- Beatriz Silva 1
- Marianna Buchicchio 1
Venues
- eamt 1