David Orrego-Carmona


2014

Predicting post-editor profiles from the translation process
Karan Singla | David Orrego-Carmona | Ashleigh Rhea Gonzales | Michael Carl | Srinivas Bangalore
Workshop on interactive and adaptive machine translation

This investigation aims to predict post-editor profiles from user behaviour and demographics using machine learning techniques, in order to gain a better understanding of post-editor styles. Our study extracts process unit features from the CasMaCat LS14 database in the CRITT Translation Process Research Database (TPR-DB). The analysis has two main research goals: first, we create n-gram models based on user activity and part-of-speech sequences to automatically cluster post-editors; second, we use discriminative classifier models to characterize post-editors based on a diverse range of translation process features. The resulting classification and clustering of participants suggest that this type of exploration could be used as a tool to develop new translation tool features or customization possibilities.
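
As a rough illustration of the first goal, comparing post-editors by n-gram models of their activity sequences might be sketched as follows. This is not the authors' implementation. The activity labels (R, T, P), the editor names, and the helper functions `ngram_profile` and `cosine_similarity` are hypothetical, and cosine similarity over bigram counts is an assumed stand-in for whatever model comparison the study used.

```python
from collections import Counter
from math import sqrt


def ngram_profile(sequence, n=2):
    """Count the n-grams in one post-editor's sequence of activity labels."""
    grams = zip(*(sequence[i:] for i in range(n)))
    return Counter(grams)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical activity labels, NOT the TPR-DB feature names:
# R = reading, T = typing, P = pause.
sessions = {
    "editor_a": list("RTRTRTP"),
    "editor_b": list("RTRTRPT"),
    "editor_c": list("PPTPPTP"),
}

profiles = {name: ngram_profile(seq) for name, seq in sessions.items()}

# Pairwise similarities; a clustering step could group editors on these.
sim_ab = cosine_similarity(profiles["editor_a"], profiles["editor_b"])
sim_ac = cosine_similarity(profiles["editor_a"], profiles["editor_c"])
```

In this toy data, `editor_a` and `editor_b` share most of their bigrams, so `sim_ab` comes out higher than `sim_ac`. A pairwise similarity matrix of this kind could then be passed to any standard clustering method.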