Mariana Souza Mota
2024
Framed Multi30K: A Frame-Based Multimodal-Multilingual Dataset
Marcelo Viridiano | Arthur Lorenzi | Tiago Timponi Torrent | Ely E. Matos | Adriana S. Pagano | Natália Sathler Sigiliano | Maucha Gamonal | Helen de Andrade Abreu | Lívia Vicente Dutra | Mairon Samagaio | Mariane Carvalho | Franciany Campos | Gabrielly Azalim | Bruna Mazzei | Mateus Fonseca de Oliveira | Ana Carolina Luz | Lívia Pádua Ruiz | Júlia Bellei | Amanda Pestana | Josiane Costa | Iasmin Rabelo | Anna Beatriz Silva | Raquel Roza | Mariana Souza Mota | Igor Oliveira | Márcio Henrique Pelegrino de Freitas
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
This paper presents Framed Multi30K (FM30K), a novel frame-based Brazilian Portuguese multimodal-multilingual dataset which (i) extends the Multi30K dataset (Elliott et al., 2016) with 158,915 original Brazilian Portuguese descriptions and 30,104 Brazilian Portuguese translations of original English descriptions; (ii) adds 2,677,613 frame evocation labels to the 158,915 English descriptions and to the ones created for Brazilian Portuguese; and (iii) extends the Flickr30k Entities dataset (Plummer et al., 2015) with 190,608 frames and Frame Elements correlations with the existing phrase-to-region correlations.