David Orrego-Carmona
2014
Predicting post-editor profiles from the translation process
Karan Singla | David Orrego-Carmona | Ashleigh Rhea Gonzales | Michael Carl | Srinivas Bangalore
Workshop on Interactive and Adaptive Machine Translation
This investigation uses machine learning techniques to predict post-editor profiles from user behaviour and demographics, with the aim of better understanding post-editor styles. Our study extracts process unit features from the CasMaCat LS14 database in the CRITT Translation Process Research Database (TPR-DB). The analysis has two main research goals: we create n-gram models based on user activity and part-of-speech sequences to automatically cluster post-editors, and we use discriminative classifier models to characterize post-editors based on a diverse range of translation process features. The resulting classification and clustering of participants suggest that this type of exploration could be used as a tool to develop new translation tool features or customization possibilities.
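As a rough illustration of the first research goal, the following minimal sketch shows one way n-gram counts over activity sequences could be used to group post-editors. It is not the authors' implementation: the activity labels (`read`, `type`, `pause`, `delete`), the participant IDs, the toy sequences, the greedy clustering rule, and the 0.5 similarity threshold are all assumptions made for the example, and none of them are taken from the TPR-DB.

```python
from collections import Counter
from math import sqrt


def ngram_profile(sequence, n=2):
    """Count the n-grams in one post-editor's activity sequence."""
    grams = zip(*(sequence[i:] for i in range(n)))
    return Counter(grams)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_profiles(profiles, threshold=0.5):
    """Greedy clustering: a profile joins the first cluster whose seed
    (first member) is at least `threshold` similar; otherwise it starts
    a new cluster."""
    clusters = []
    for name, profile in profiles.items():
        for cluster in clusters:
            if cosine_similarity(profile, profiles[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical activity sequences for three hypothetical participants.
sessions = {
    "P01": ["read", "type", "read", "type", "read", "type"],
    "P02": ["read", "type", "read", "type", "pause", "type"],
    "P03": ["delete", "pause", "delete", "pause", "delete", "pause"],
}

profiles = {name: ngram_profile(seq) for name, seq in sessions.items()}
clusters = cluster_profiles(profiles)
print(clusters)
```

With these toy sequences, P01 and P02 share most of their bigrams and fall into the same group, while P03 has no bigrams in common with either and forms its own. A real analysis would more likely use an established clustering method over features drawn from the actual process data.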