Using Machine Translation to Augment Multilingual Classification

Adam King


Abstract

An all-too-present bottleneck for text classification model development is the need to annotate training data, and this need is multiplied for multilingual classifiers. Fortunately, contemporary machine translation models are both easily accessible and dependable in translation quality, making it possible to translate labeled training data from one language into another. Here, we explore the effects of using machine-translated data to fine-tune a multilingual model for a classification task across multiple languages. We also investigate the benefits of using a novel technique, originally proposed in the field of image captioning, to account for potential negative effects of tuning models on translated data. We show that translated data are of sufficient quality to tune multilingual classifiers and that this novel loss technique offers some improvement over models tuned without it.

Anthology ID: 2024.eamt-1.9
Volume: Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)
Month: June
Year: 2024
Address: Sheffield, UK
Editors: Carolina Scarton, Charlotte Prescott, Chris Bayliss, Chris Oakley, Joanna Wright, Stuart Wrigley, Xingyi Song, Edward Gow-Smith, Rachel Bawden, Víctor M Sánchez-Cartagena, Patrick Cadwell, Ekaterina Lapshinova-Koltunski, Vera Cabarrão, Konstantinos Chatzitheodorou, Mary Nurminen, Diptesh Kanojia, Helena Moniz
Venue: EAMT
Publisher: European Association for Machine Translation (EAMT)
Pages: 59–67
URL: https://aclanthology.org/2024.eamt-1.9/
Cite (ACL): Adam King. 2024. Using Machine Translation to Augment Multilingual Classification. In Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1), pages 59–67, Sheffield, UK. European Association for Machine Translation (EAMT).
Cite (Informal): Using Machine Translation to Augment Multilingual Classification (King, EAMT 2024)
PDF: https://aclanthology.org/2024.eamt-1.9.pdf
