Nearest-Neighbor Retrieval for Indigenous Image Captioning

Justin Vasselli; Arturo Martínez Peguero; Shintaro Ozaki; Frederikus Hudi; Haruki Sakajo; Taro Watanabe

Abstract

This paper describes the NAIST submission to the AmericasNLP 2026 Shared Task on Indigenous Language Image Captioning. We investigate two approaches for generating captions in Bribri, Guaraní, Nahuatl, Wixárika, and Yucatec Maya. The first is a nearest-neighbor retrieval system that uses CLIP image embeddings to retrieve the most similar image from the development set and directly reuse its caption. The second is a generation pipeline that combines scene analysis, dictionary-grounded lexical planning, retrieved gloss templates, and interlinear gloss representations to constrain generation in low-resource settings. The retrieval-based approach substantially outperformed the gloss-based pipeline under chrF++ evaluation and was competitive across all submitted systems, achieving first-place automated system rankings for Bribri and Wixárika and third place for Nahuatl. The gloss-based pipeline produced weaker automatic evaluation results and exposed problems with dictionary coverage, orthographic mismatches between resources, and unstable grammatical generation. Our results suggest that retrieval-based methods provide a strong baseline for low-resource captioning tasks when high-quality examples are available.
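The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes that image embeddings have already been computed (for example by a CLIP image encoder) and are available as plain lists of floats, and that similarity is measured by cosine similarity; the abstract does not state which similarity measure was used. The function names, the toy three-dimensional vectors, and the placeholder captions are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_caption(
    query_embedding: Sequence[float],
    dev_set: List[Tuple[Sequence[float], str]],
) -> Tuple[str, float]:
    """Return the caption of the most similar development-set image and its score.

    dev_set holds (embedding, caption) pairs. Cosine similarity is an
    assumption of this sketch, not a detail confirmed by the source.
    """
    if not dev_set:
        raise ValueError("dev_set must contain at least one (embedding, caption) pair")
    best_caption, best_score = dev_set[0][1], -math.inf
    for embedding, caption in dev_set:
        score = cosine_similarity(query_embedding, embedding)
        if score > best_score:
            best_caption, best_score = caption, score
    return best_caption, best_score


# Toy stand-ins for precomputed image embeddings and their captions (hypothetical data).
toy_dev_set = [
    ([1.0, 0.0, 0.0], "caption A"),
    ([0.0, 1.0, 0.0], "caption B"),
    ([0.0, 0.0, 1.0], "caption C"),
]

# The query is closest in direction to the first toy embedding, so "caption A" is reused.
caption, score = retrieve_caption([0.9, 0.1, 0.0], toy_dev_set)
```

Because the retrieved caption is reused verbatim, the output is always a caption from the development set, so its quality depends on how well that set covers the test images.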

Anthology ID: 2026.americasnlp-6.26
Volume: Proceedings of the Sixth Workshop on NLP for Indigenous Languages of the Americas (AmericasNLP)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Manuel Mager, Abteen Ebrahimi, Minh Duc Bui, Robert Pugh, Arturo Oncevay, Luis Chiruzzo, Rolando Coto Solano, Shruti Rijhwani, Katharina Von Der Wense
Venues: AmericasNLP | WS
Publisher: Association for Computational Linguistics
Pages: 272–278
URL: https://aclanthology.org/2026.americasnlp-6.26/
Cite (ACL): Justin Vasselli, Arturo Martínez Peguero, Shintaro Ozaki, Frederikus Hudi, Haruki Sakajo, and Taro Watanabe. 2026. Nearest-Neighbor Retrieval for Indigenous Image Captioning. In Proceedings of the Sixth Workshop on NLP for Indigenous Languages of the Americas (AmericasNLP), pages 272–278, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): Nearest-Neighbor Retrieval for Indigenous Image Captioning (Vasselli et al., AmericasNLP 2026)
PDF: https://aclanthology.org/2026.americasnlp-6.26.pdf
