A Comparison of the Word Similarity Measurement in English-Arabic Translation Memory Segment Retrieval Including an Inflectional Affix Intervention

Khaled Ben Milad


Abstract
This paper investigates how the translation memory (TM) similarity measures of five representative computer-aided translation (CAT) tools behave when retrieving sentences that differ only in verb inflection in Arabic to English translation. In English, inflectional affixes on verbs are suffixes only; Arabic verbs, by contrast, mark voice, mood, tense, number and person through various inflectional affixes attached before or after the verb root. The research question is whether the TM similarity algorithm treats a combination of inflectional affixes as a word-level or as a character-level intervention when retrieving a segment and, if it is treated as a character-level intervention, whether the different types of intervention are penalized equally or differently. Using a black-box testing methodology and a test suite, the paper experimentally examines the penalties that current TM algorithms impose when the input segment and the retrieved TM source are identical except for a difference in an inflectional affix. If TM systems had some linguistic knowledge, the penalty would be expected to be very light, which would be useful to translators, since a high-scoring match would be presented near the top of the list of proposals. However, analysis of the TM systems’ output shows that inflectional affixes are penalized more heavily than expected, and in different ways: they may be treated as an intervention on the whole word or as a single-character change.
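The contrast between a word-level and a character-level intervention can be made concrete with a small sketch. It assumes a plain Levenshtein-based score normalized by the longer segment; the scoring algorithms of the CAT tools under test are proprietary and are not described here, so the function names (`word_level_score`, `char_level_score`) and the Arabic sentence pair are illustrative assumptions rather than material from the paper's test suite.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer sequence."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def word_level_score(seg_a, seg_b):
    # Hypothetical scorer: every changed word counts as one whole edit.
    return similarity(seg_a.split(), seg_b.split())


def char_level_score(seg_a, seg_b):
    # Hypothetical scorer: every changed character counts as one edit.
    return similarity(seg_a, seg_b)


# Illustrative pair: the segments differ only in one verbal prefix.
tm_source = "كتب الطالب الدرس"   # "the student wrote the lesson"
new_input = "يكتب الطالب الدرس"  # "the student writes the lesson"

print(f"word level: {word_level_score(new_input, tm_source):.2f}")  # 0.67
print(f"char level: {char_level_score(new_input, tm_source):.2f}")  # 0.94
```

Under these assumptions, the same one-prefix difference costs a third of the score when the verb is counted as a changed word, but only about six percent when it is counted as one inserted character. This is the kind of gap the black-box tests are designed to expose.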
Anthology ID:
2021.triton-1.14
Volume:
Proceedings of the Translation and Interpreting Technology Online Conference
Month:
July
Year:
2021
Address:
Held Online
Editors:
Ruslan Mitkov, Vilelmini Sosoni, Julie Christine Giguère, Elena Murgolo, Elizabeth Deysel
Venue:
TRITON
Publisher:
INCOMA Ltd.
Pages:
125–141
URL:
https://aclanthology.org/2021.triton-1.14
Cite (ACL):
Khaled Ben Milad. 2021. A Comparison of the Word Similarity Measurement in English-Arabic Translation Memory Segment Retrieval Including an Inflectional Affix Intervention. In Proceedings of the Translation and Interpreting Technology Online Conference, pages 125–141, Held Online. INCOMA Ltd.
Cite (Informal):
A Comparison of the Word Similarity Measurement in English-Arabic Translation Memory Segment Retrieval Including an Inflectional Affix Intervention (Ben Milad, TRITON 2021)
PDF:
https://aclanthology.org/2021.triton-1.14.pdf