Using CollGram to Compare Formulaic Language in Human and Machine Translation

Yves Bestgen


Abstract
A comparison of formulaic sequences in human and neural machine translations of quality newspaper articles shows that neural machine translations contain fewer lower-frequency but strongly associated formulaic sequences (FSs) and more high-frequency FSs. These observations can be related to the differences between second language learners of various levels and between translated and untranslated texts. A comparison among the neural machine translation systems indicates that some produce more FSs of both types than others.
Anthology ID:
2021.triton-1.20
Volume:
Proceedings of the Translation and Interpreting Technology Online Conference
Month:
July
Year:
2021
Address:
Held Online
Editors:
Ruslan Mitkov, Vilelmini Sosoni, Julie Christine Giguère, Elena Murgolo, Elizabeth Deysel
Venue:
TRITON
Publisher:
INCOMA Ltd.
Pages:
174–180
URL:
https://aclanthology.org/2021.triton-1.20
Cite (ACL):
Yves Bestgen. 2021. Using CollGram to Compare Formulaic Language in Human and Machine Translation. In Proceedings of the Translation and Interpreting Technology Online Conference, pages 174–180, Held Online. INCOMA Ltd.
Cite (Informal):
Using CollGram to Compare Formulaic Language in Human and Machine Translation (Bestgen, TRITON 2021)
PDF:
https://aclanthology.org/2021.triton-1.20.pdf