Current Shortcomings of Machine Translation in Spanish and Bulgarian Vis-à-vis English

Travis Sorenson

Abstract

In late 2016, Google Translate (GT), widely considered a machine translation leader, replaced its statistical machine translation (SMT) functions with a neural machine translation (NMT) model for many large languages, including Spanish, with other languages following thereafter. Whereas the capabilities of GT had previously advanced incrementally, this switch to NMT resulted in seemingly exponential improvement. However, half a dozen years later, while GT’s usefulness should be recognized, it is also imperative to systematically evaluate its ongoing shortcomings, including determining which challenges may reasonably be presumed superable over time and which, following a multiyear tracking study, prove unlikely ever to be fully resolved. While the research in question principally explores Spanish-English-Spanish machine translation, this paper also examines similar problems with Bulgarian-English-Bulgarian GT renditions. A better understanding of both the strengths and weaknesses of current machine translation applications is fundamental to knowing when such non-human natural language processing (NLP) technology is capable of performing all or most of a given task, and when heavy, perhaps even exclusive, human intervention is still required.

Anthology ID: 2022.clib-1.20
Volume: Proceedings of the Fifth International Conference on Computational Linguistics in Bulgaria (CLIB 2022)
Month: September
Year: 2022
Address: Sofia, Bulgaria
Venue: CLIB
Publisher: Department of Computational Linguistics, IBL -- BAS
Pages: 171–180
URL: https://aclanthology.org/2022.clib-1.20
Cite (ACL): Travis Sorenson. 2022. Current Shortcomings of Machine Translation in Spanish and Bulgarian Vis-à-vis English. In Proceedings of the Fifth International Conference on Computational Linguistics in Bulgaria (CLIB 2022), pages 171–180, Sofia, Bulgaria. Department of Computational Linguistics, IBL -- BAS.
Cite (Informal): Current Shortcomings of Machine Translation in Spanish and Bulgarian Vis-à-vis English (Sorenson, CLIB 2022)
PDF: https://aclanthology.org/2022.clib-1.20.pdf
