Translating Domain-Specific Terminology in Typologically-Diverse Languages: A Study in Tax and Financial Education

Arturo Oncevay; Elena Kochkina; Keshav Ramani; Toyin Aguda; Simerjot Kaur; Charese Smiley

doi:10.18653/v1/2025.emnlp-main.1774

Abstract

Domain-specific multilingual terminology is essential for accurate machine translation (MT) and cross-lingual NLP applications. We present a gold-standard terminology resource for the tax and financial education domains, built from curated governmental publications and covering seven typologically diverse languages: English, Spanish, Russian, Vietnamese, Korean, Chinese (traditional and simplified) and Haitian Creole. Using this resource, we assess various MT systems and LLMs on translation quality and term accuracy. We annotate over 3,000 terms for domain-specificity, facilitating a comparison between domain-specific and general term translations, and observe models’ challenges with specialized tax terms. We also analyze the case of terminology-aided translation, and the LLMs’ performance in extracting the translated term given the context. Our results highlight model limitations and the value of high-quality terminologies for advancing MT research in specialized contexts.

Anthology ID: 2025.emnlp-main.1774
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 35030–35044
URL: https://aclanthology.org/2025.emnlp-main.1774/
DOI: 10.18653/v1/2025.emnlp-main.1774
Cite (ACL): Arturo Oncevay, Elena Kochkina, Keshav Ramani, Toyin Aguda, Simerjot Kaur, and Charese Smiley. 2025. Translating Domain-Specific Terminology in Typologically-Diverse Languages: A Study in Tax and Financial Education. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 35030–35044, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Translating Domain-Specific Terminology in Typologically-Diverse Languages: A Study in Tax and Financial Education (Oncevay et al., EMNLP 2025)
PDF: https://aclanthology.org/2025.emnlp-main.1774.pdf
Checklist: 2025.emnlp-main.1774.checklist.pdf
