Damien Hansen


2023

The MAKE-NMTVIZ System Description for the WMT23 Literary Task
Fabien Lopez | Gabriela González | Damien Hansen | Mariam Nakhle | Behnoosh Namdarzadeh | Nicolas Ballier | Marco Dinarelli | Emmanuelle Esperança-Rodier | Sui He | Sadaf Mohseni | Caroline Rossi | Didier Schwab | Jun Yang | Jean-Baptiste Yunès | Lichao Zhu
Proceedings of the Eighth Conference on Machine Translation

This paper describes the MAKE-NMTVIZ systems trained for the WMT 2023 Literary task. As our primary submission, we used train, valid1, and test1 from the GuoFeng corpus (Wang et al., 2023) to fine-tune the mBART50 model on Chinese-English data. We followed training parameters very similar to those of Lee et al. (2022) when fine-tuning mBART50. We trained for 3 epochs, using GELU as the activation function, with a learning rate of 0.05, a dropout of 0.1, and a batch size of 16. We decoded using a beam search of size 5. For our contrastive1 submission, we implemented a fine-tuned concatenation transformer (Lupo et al., 2023). Training proceeded in two steps: (i) a sentence-level transformer was trained for 10 epochs on general, test1, and valid1 data (more details in the contrastive2 system); (ii) we then fine-tuned at document level with 3-sentence concatenations for 4 epochs on train, test2, and valid2 data. During fine-tuning, we used ReLU as the activation function, with an inverse square root learning rate schedule, a dropout of 0.1, and a batch size of 64. We decoded using beam search. For our contrastive2 and last submission, we implemented a sentence-level transformer model (Vaswani et al., 2017). The model was trained for 10 epochs on general-purpose, test1, and valid1 data. The training parameters were an inverse square root learning rate schedule, a dropout of 0.1, and a batch size of 64. We decoded using a beam search of size 4. We then compared the three translation outputs from an interdisciplinary perspective, investigating some of the effects of sentence- vs. document-based training. Computer scientists, translators, and corpus linguists discussed the remaining linguistic issues for this discourse-level literary translation.
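Two of the submissions above rely on the inverse square root learning-rate schedule introduced with the original Transformer (Vaswani et al., 2017): a linear warmup followed by decay proportional to 1/sqrt(step). A minimal sketch of that schedule follows; the base rate and warmup length are illustrative assumptions, not values reported in the abstract.

```python
def inverse_sqrt_lr(step, base_lr=5e-4, warmup_steps=4000):
    """Inverse square root schedule (Vaswani et al., 2017).

    base_lr and warmup_steps are hypothetical defaults for illustration;
    the paper does not report these values.
    """
    step = max(step, 1)
    if step < warmup_steps:
        # Linear warmup up to base_lr.
        return base_lr * step / warmup_steps
    # Decay proportionally to 1/sqrt(step) after warmup.
    return base_lr * (warmup_steps ** 0.5) / (step ** 0.5)
```

For example, at the end of warmup the rate peaks at `base_lr`, and quadrupling the step count thereafter halves it: `inverse_sqrt_lr(16000)` is half of `inverse_sqrt_lr(4000)`.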

2022

A Snapshot into the Possibility of Video Game Machine Translation
Damien Hansen | Pierre-Yves Houlmont
Proceedings of the 15th Biennial Conference of the Association for Machine Translation in the Americas (Volume 2: Users and Providers Track and Government Track)

In this article, we trained what we believe to be the first MT system adapted to video game translation and show that very limited in-domain data is enough to largely surpass publicly available systems, while also revealing interesting findings in the final translation. After introducing some of the challenges of video game translation, the existing literature, and the systems and data sets used in this experiment, we provide and discuss the resulting translation as well as the potential benefits of such a system. We find that the model is able to learn typical rules and patterns of video game translation from English into French, indicating that video game machine translation could prove useful given the encouraging results and the specific working conditions of translators in this field. As with other use cases of MT in cultural sectors, however, we believe this depends heavily on the proper implementation of the tool, which we think could then stimulate creativity.

2021

Les lettres et la machine : un état de l’art en traduction littéraire automatique (Machines in the humanities: current state of the art in literary machine translation)
Damien Hansen
Actes de la 28e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 2 : 23e REncontres jeunes Chercheurs en Informatique pour le TAL (RECITAL)

Given the recent wave of interest in literary machine translation, this article surveys the work already published on the subject, while also sharing some positions on the matter. We begin by presenting the pioneering work that motivated this line of research, as well as the results obtained more recently in various scenarios and for various language pairs. To conclude this overview, we present the beginnings of our own work on the English-French pair, before discussing the concerns and benefits to keep in mind in discussions around this technology.