XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples

Peiqin Lin; André F. T. Martins; Hinrich Schütze

doi:10.18653/v1/2025.findings-naacl.221

Abstract

Recent studies indicate that off-the-shelf or fine-tuned retrievers, which select in-context examples relevant to the input query, improve few-shot in-context learning in English. However, adapting these methods to other languages, especially low-resource ones, is difficult because cross-lingual retrievers and annotated data are scarce. We therefore introduce XAMPLER: Cross-Lingual Example Retrieval, a method for cross-lingual in-context learning that uses only annotated English data. XAMPLER first trains a retriever based on Glot500, a small multilingual language model, using positive and negative English examples constructed from the predictions of a multilingual large language model, MaLA500. Thanks to the retriever's cross-lingual capacity, XAMPLER can then directly retrieve English examples as few-shot examples for in-context learning in target languages. Experiments on two multilingual text classification benchmarks, SIB200 with 176 languages and MasakhaNEWS with 16 languages, show that XAMPLER substantially improves in-context learning performance across languages.
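The two steps the abstract describes (labeling English candidates as positive or negative from a language model's predictions, then retrieving English examples for a target-language query) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the names `label_candidates`, `retrieve`, and `copy_label_llm` are hypothetical, the two-dimensional vectors stand in for embeddings that a multilingual encoder such as Glot500 would produce, the stub predictor stands in for MaLA500, and the rule "a candidate is positive if the model predicts the gold label when shown it" is one plausible reading of the abstract, not a confirmed detail of the paper.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Example = Tuple[str, str]  # (English text, label)


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_candidates(
    query: str,
    gold: str,
    candidates: List[Example],
    llm_predict: Callable[[Example, str], str],
) -> Tuple[List[Example], List[Example]]:
    """Split candidates into positives and negatives.

    Assumption: a candidate counts as positive when the language model,
    given that candidate as its in-context example, predicts the gold label.
    """
    positives, negatives = [], []
    for example in candidates:
        if llm_predict(example, query) == gold:
            positives.append(example)
        else:
            negatives.append(example)
    return positives, negatives


def retrieve(
    query_vec: Vector, pool: List[Tuple[Vector, Example]], k: int
) -> List[Example]:
    """Return the k pool examples whose vectors are most similar to the query."""
    ranked = sorted(pool, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [example for _, example in ranked[:k]]


# Toy English pool with hand-written stand-in embeddings.
pool = [
    ([1.0, 0.0], ("The team won the final.", "sports")),
    ([0.0, 1.0], ("Parliament passed the bill.", "politics")),
    ([0.9, 0.1], ("The striker scored twice.", "sports")),
]

# Stand-in embedding of a target-language query; in a shared multilingual
# space it is assumed to land near English sentences on the same topic.
query_vec = [0.8, 0.2]
few_shot = retrieve(query_vec, pool, k=2)


def copy_label_llm(example: Example, query: str) -> str:
    """Stub predictor that simply echoes the label of the shown example."""
    return example[1]


positives, negatives = label_candidates(
    "An English training query about football.",
    "sports",
    [example for _, example in pool],
    copy_label_llm,
)
```

In a real setting the positives and negatives would serve as training signal for the retriever (for example with a contrastive objective, which is also an assumption here), and the retrieved English examples would be placed in the prompt ahead of the target-language query.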

Anthology ID: 2025.findings-naacl.221
Volume: Findings of the Association for Computational Linguistics: NAACL 2025
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3968–3977
URL: https://aclanthology.org/2025.findings-naacl.221/
DOI: 10.18653/v1/2025.findings-naacl.221
Cite (ACL): Peiqin Lin, Andre Martins, and Hinrich Schuetze. 2025. XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 3968–3977, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples (Lin et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-naacl.221.pdf
