Anti-LM Decoding for Zero-shot In-context Machine Translation

Suzanna Sia, Alexandra DeLucia, Kevin Duh


Abstract
Zero-shot in-context learning is the phenomenon where models can perform a task given only instructions. However, pre-trained large language models are known to be poorly calibrated for zero-shot tasks. One of the most effective approaches to handling this miscalibration is to adopt a contrastive decoding objective, which accounts for the prior probability of generating the next token by conditioning on some context. This work introduces an Anti-Language Model objective with a decay factor designed to address the weaknesses of in-context machine translation. We conduct experiments across 3 model types and sizes, 3 language directions, and both greedy decoding and beam search. The proposed method outperforms other state-of-the-art decoding objectives, with up to a 20 BLEU point improvement over the default objective in some settings.
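As a rough illustration of the objective the abstract describes, the sketch below scores next-token candidates by subtracting a decayed, source-conditioned prior from the full translation log-probability. This is a minimal sketch only, assuming a HuggingFace-style causal LM whose forward pass returns .logits; the function name anti_lm_scores and the parameters alpha and gamma are our placeholders, not the paper's exact formulation.

import torch

def anti_lm_scores(model, src_ids, tgt_ids, alpha=0.5, gamma=0.9):
    # Score candidates for the next target token y_t (sketch, not the authors' code).
    # full: log p(y_t | x, y_<t), conditioned on the source plus the partial translation
    full_logits = model(torch.cat([src_ids, tgt_ids], dim=-1)).logits[:, -1, :]
    full = torch.log_softmax(full_logits, dim=-1)
    # prior: log p(y_t | x), the "anti-LM" term conditioned on the source alone
    prior = torch.log_softmax(model(src_ids).logits[:, -1, :], dim=-1)
    t = tgt_ids.shape[-1]  # current decoding step, used for the decay factor
    # Contrastive objective: discount the prior with an exponentially decaying weight,
    # so the penalty is strongest early in decoding and fades as generation proceeds.
    return full - alpha * (gamma ** t) * prior

# Greedy decoding would then pick, at each step:
# next_token = anti_lm_scores(model, src_ids, tgt_ids).argmax(dim=-1)

The decay factor gamma ** t reflects the abstract's motivation: the language-model prior dominates most at the start of generation, so the contrastive penalty is applied most strongly there.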
Anthology ID:
2024.findings-naacl.216
Volume:
Findings of the Association for Computational Linguistics: NAACL 2024
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3403–3420
URL:
https://aclanthology.org/2024.findings-naacl.216
Cite (ACL):
Suzanna Sia, Alexandra DeLucia, and Kevin Duh. 2024. Anti-LM Decoding for Zero-shot In-context Machine Translation. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 3403–3420, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Anti-LM Decoding for Zero-shot In-context Machine Translation (Sia et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-naacl.216.pdf