Towards the Machine Translation of Scientific Neologisms

Paul Lerner, François Yvon


Abstract
Scientific research continually discovers and invents new concepts, which are then referred to by new terms, neologisms, or neonyms in this context. As the vast majority of publications are written in English, disseminating this new knowledge to the general public often requires translating these terms. However, by definition, no parallel data exist to provide such translations. Therefore, we propose to leverage term definitions as a useful source of information for the translation process. As we discuss, Large Language Models are well suited for this task and can benefit from in-context learning with co-hyponyms and terms sharing the same derivation paradigm. These models, however, are sensitive to the superficial and morphological similarity between source and target terms. Their predictions are also impacted by subword tokenization, especially for prefixed terms.
Anthology ID:
2025.coling-main.63
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
947–963
URL:
https://aclanthology.org/2025.coling-main.63/
Cite (ACL):
Paul Lerner and François Yvon. 2025. Towards the Machine Translation of Scientific Neologisms. In Proceedings of the 31st International Conference on Computational Linguistics, pages 947–963, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Towards the Machine Translation of Scientific Neologisms (Lerner & Yvon, COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.63.pdf