On the Multilingual Capabilities of Very Large-Scale English Language Models

Jordi Armengol-Estapé, Ona de Gibert Bonet, Maite Melero


Abstract
Generative Pre-trained Transformers (GPTs) have recently been scaled to unprecedented sizes in the history of machine learning. These models, trained solely on the language modeling objective, have been shown to exhibit outstanding zero-, one-, and few-shot learning capabilities on a number of different tasks. Nevertheless, aside from anecdotal experience, little is known about their multilingual capabilities, given that the pre-training corpus is almost entirely composed of English text. In this work, we investigate the multilingual potential and limits of GPT-3 on three tasks, extractive question answering, text summarization, and natural language generation, in five different languages, as well as the effect of model scale. Our results show that GPT-3 can be almost as useful for many languages as it is for English, with room for improvement if its tokenization is optimized for non-English text.
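
To make the tokenization point concrete: GPT-3 inherits GPT-2's English-centric byte-level BPE vocabulary, so equivalent sentences in other languages are typically split into more subword tokens, consuming context-window and generation budget faster. Below is a minimal sketch of how one might measure this, assuming the Hugging Face transformers GPT-2 tokenizer as a proxy for GPT-3's; the parallel sentences are illustrative and not taken from the paper.

    from transformers import GPT2TokenizerFast

    # GPT-3 uses essentially the same byte-level BPE vocabulary as GPT-2,
    # so the GPT-2 tokenizer is a reasonable proxy for GPT-3 token counts.
    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

    # Illustrative parallel sentences (assumed, not from the paper).
    sentences = {
        "English": "The weather is very nice today.",
        "Spanish": "El tiempo es muy agradable hoy.",
        "Catalan": "El temps és molt agradable avui.",
        "German": "Das Wetter ist heute sehr schön.",
    }

    for lang, text in sentences.items():
        ids = tokenizer(text)["input_ids"]
        # Non-English text tends to fragment into more, shorter subwords,
        # e.g. because accented characters are encoded at the byte level.
        print(f"{lang:8s} {len(ids):2d} tokens  {tokenizer.convert_ids_to_tokens(ids)}")

On sentences like these, the non-English variants usually cost more tokens than the English one, which is the inefficiency the abstract alludes to.
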
Anthology ID: 2022.lrec-1.327
Volume: Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month: June
Year: 2022
Address: Marseille, France
Editors: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue: LREC
Publisher: European Language Resources Association
Pages: 3056–3068
URL: https://aclanthology.org/2022.lrec-1.327
Cite (ACL): Jordi Armengol-Estapé, Ona de Gibert Bonet, and Maite Melero. 2022. On the Multilingual Capabilities of Very Large-Scale English Language Models. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3056–3068, Marseille, France. European Language Resources Association.
Cite (Informal): On the Multilingual Capabilities of Very Large-Scale English Language Models (Armengol-Estapé et al., LREC 2022)
PDF: https://aclanthology.org/2022.lrec-1.327.pdf
Code: temu-bsc/gpt3-queries (plus additional community code)
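
For orientation, here is a hypothetical sketch of the kind of few-shot query behind experiments like these; the linked temu-bsc/gpt3-queries repository contains the authors' actual queries, and nothing below is taken from it. The sketch builds a two-shot extractive question-answering prompt in Spanish and sends it to the legacy OpenAI completions API (openai-python < 1.0); the engine name, in-context examples, and decoding parameters are all assumptions.

    # Hypothetical sketch of a few-shot GPT-3 query (not the authors' code).
    # Assumes the legacy openai-python (< 1.0) Completions API and an API
    # key in the OPENAI_API_KEY environment variable.
    import os

    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    # Illustrative in-context examples for extractive QA in Spanish.
    examples = [
        (
            "Contexto: París es la capital de Francia.\n"
            "Pregunta: ¿Cuál es la capital de Francia?",
            "París",
        ),
        (
            "Contexto: El Ebro desemboca en el mar Mediterráneo.\n"
            "Pregunta: ¿Dónde desemboca el Ebro?",
            "En el mar Mediterráneo",
        ),
    ]
    query = (
        "Contexto: La Sagrada Família se encuentra en Barcelona.\n"
        "Pregunta: ¿Dónde se encuentra la Sagrada Família?"
    )

    # k-shot prompt: worked examples first, then the unanswered query.
    prompt = "\n\n".join(f"{ctx}\nRespuesta: {ans}" for ctx, ans in examples)
    prompt += f"\n\n{query}\nRespuesta:"

    response = openai.Completion.create(
        engine="davinci",   # assumed GPT-3 engine name
        prompt=prompt,
        max_tokens=32,
        temperature=0.0,    # deterministic decoding suits extractive QA
        stop=["\n"],        # stop at the end of the answer line
    )
    print(response.choices[0].text.strip())

The greedy decoding and newline stop sequence reflect the extractive setting, where the expected answer is a short span copied from the context rather than free-form text.
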