Schema-Constrained Image Captioning for Five Low-Resource Indigenous Languages

Diego Cuadros, Nicholas Leeds, Amanda Avalos, Azul Alpizar-Velazquez, Jared Coleman, Faezeh Dehghan Tarzjani, Bhaskar Krishnamachari


Abstract
We describe our submission to all five tracks of the AmericasNLP 2026 Shared Task on Cultural Image Captioning: Bribri, Guaraní, Yucatec Maya, Orizaba Nahuatl, and Wixárika. Our system is an LLM-assisted rule-based machine translation (LLM-RBMT) captioner. For each language, a coding agent reads the small development split and open-web linguistic references and writes a complete Pydantic grammar package with a closed vocabulary. At inference time, a vision–language model sees the image and the schema, emits a structured SentenceList under constrained decoding, and a deterministic Python renderer produces the surface string. The model never generates target-language tokens. The same architecture handles all five languages with no fine-tuning, no parallel corpora, and no human edits to the generated packages. On the official test set, the system placed first on human evaluation in Bribri and Orizaba Nahuatl, third on Yucatec Maya, and first on ChrF++ in Yucatec Maya. We suggest that a strength of the approach is that outputs are restricted to simple sentences that are grammatically correct by construction, modulo the correctness of the generated grammar itself.
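The pipeline described in the abstract (closed vocabulary, structured SentenceList, deterministic renderer) can be illustrated with a minimal sketch. This is not the authors' generated grammar package. It uses standard-library dataclasses and enums as a dependency-free stand-in for the Pydantic models the abstract mentions, and it replaces the vision–language model with a hand-written JSON string. The names `Noun`, `Verb`, `Sentence`, `parse_sentence_list`, and `render`, the subject-object-verb order, and the `tgt_*` surface forms are all hypothetical placeholders, not words or rules from any of the five languages.

```python
import json
from dataclasses import dataclass
from enum import Enum


# Closed vocabulary: the model may only pick from these concept labels.
# (Hypothetical toy inventory, chosen for illustration only.)
class Noun(str, Enum):
    WOMAN = "woman"
    DOG = "dog"
    BASKET = "basket"


class Verb(str, Enum):
    HOLD = "hold"
    SEE = "see"


# Placeholder surface forms. These are NOT real target-language words;
# a real grammar package would store the actual lexicon here.
NOUN_FORMS = {Noun.WOMAN: "tgt_woman", Noun.DOG: "tgt_dog", Noun.BASKET: "tgt_basket"}
VERB_FORMS = {Verb.HOLD: "tgt_hold", Verb.SEE: "tgt_see"}


@dataclass(frozen=True)
class Sentence:
    subject: Noun
    verb: Verb
    obj: Noun


@dataclass(frozen=True)
class SentenceList:
    sentences: tuple


def parse_sentence_list(raw_json: str) -> SentenceList:
    """Validate structured model output against the closed vocabulary.

    Noun(...) and Verb(...) raise ValueError for any label outside the enum,
    so out-of-vocabulary content is rejected before rendering.
    """
    data = json.loads(raw_json)
    return SentenceList(tuple(
        Sentence(Noun(item["subject"]), Verb(item["verb"]), Noun(item["object"]))
        for item in data["sentences"]
    ))


def render(sentence_list: SentenceList) -> str:
    """Deterministically map each structured sentence to a surface string.

    The subject-object-verb order is an assumption made for this sketch.
    """
    rendered = []
    for s in sentence_list.sentences:
        words = [NOUN_FORMS[s.subject], NOUN_FORMS[s.obj], VERB_FORMS[s.verb]]
        rendered.append(" ".join(words) + ".")
    return " ".join(rendered)
```

Calling `render(parse_sentence_list(raw))` on a JSON string such as `{"sentences": [{"subject": "woman", "verb": "hold", "object": "basket"}]}` yields a string built only from lexicon entries, which is the sense in which the model never generates target-language tokens: it selects labels, and the renderer owns every surface form.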
Anthology ID:
2026.americasnlp-6.24
Volume:
Proceedings of the Sixth Workshop on NLP for Indigenous Languages of the Americas (AmericasNLP)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Manuel Mager, Abteen Ebrahimi, Minh Duc Bui, Robert Pugh, Arturo Oncevay, Luis Chiruzzo, Rolando Coto Solano, Shruti Rijhwani, Katharina Von Der Wense
Venues:
AmericasNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
257–263
URL:
https://aclanthology.org/2026.americasnlp-6.24/
Cite (ACL):
Diego Cuadros, Nicholas Leeds, Amanda Avalos, Azul Alpizar-Velazquez, Jared Coleman, Faezeh Dehghan Tarzjani, and Bhaskar Krishnamachari. 2026. Schema-Constrained Image Captioning for Five Low-Resource Indigenous Languages. In Proceedings of the Sixth Workshop on NLP for Indigenous Languages of the Americas (AmericasNLP), pages 257–263, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
Schema-Constrained Image Captioning for Five Low-Resource Indigenous Languages (Cuadros et al., AmericasNLP 2026)
PDF:
https://aclanthology.org/2026.americasnlp-6.24.pdf
Supplementary material:
2026.americasnlp-6.24.SupplementaryMaterial.zip