Predicting post-editor profiles from the translation process

Karan Singla, David Orrego-Carmona, Ashleigh Rhea Gonzales, Michael Carl, Srinivas Bangalore


Abstract
This investigation uses machine learning techniques to predict post-editor profiles from user behaviour and demographics, with the aim of better understanding post-editor styles. Our study extracts process unit features from the CasMaCat LS14 database in the CRITT Translation Process Research Database (TPR-DB). The analysis has two main goals: first, we create n-gram models based on user activity and part-of-speech sequences to automatically cluster post-editors; second, we use discriminative classifier models to characterize post-editors based on a diverse range of translation process features. The resulting classification and clustering of participants suggest that this type of exploration could serve as a tool for developing new translation tool features or customization possibilities.
Anthology ID:
2014.amta-workshop.6
Volume:
Workshop on interactive and adaptive machine translation
Month:
October 22
Year:
2014
Address:
Vancouver, Canada
Editors:
Francisco Casacuberta, Marcello Federico, Philipp Koehn
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
Pages:
51–60
URL:
https://aclanthology.org/2014.amta-workshop.6
Cite (ACL):
Karan Singla, David Orrego-Carmona, Ashleigh Rhea Gonzales, Michael Carl, and Srinivas Bangalore. 2014. Predicting post-editor profiles from the translation process. In Workshop on interactive and adaptive machine translation, pages 51–60, Vancouver, Canada. Association for Machine Translation in the Americas.
Cite (Informal):
Predicting post-editor profiles from the translation process (Singla et al., AMTA 2014)
PDF:
https://aclanthology.org/2014.amta-workshop.6.pdf