Lívia Vicente Dutra
Also published as: Lívia Vicente Dutra
2025
Audition: A Frame-Annotated Multimodal Dataset for Accessible Audiovisual Content
Maucha Andrade Gamonal | Tiago Timponi Torrent | Ely Edison Matos | Adriana S. Pagano | Frederico Belcavello | Flávia Affonso Mayer | Arthur Lorenzi | Natalia S. Sigiliano | Helen de Andrade Abreu | Lívia Vicente Dutra | Marcelo Viridiano | André Coneglian | Victor A. S. Herbst | Franciany O. Campos | Kenneth Brown | Lívia Padua Ruiz | Lisandra Carvalho Bonoto | Luiz Fernando Pereira | Yulla Liquer Navarro
Proceedings of the 21st Joint ACL - ISO Workshop on Interoperable Semantic Annotation (ISA-21)
This paper presents a multimodal semantic analysis of accessible Brazilian short films using a frame-based annotation approach. We introduce a subset of the Audition dataset, comprising six short films from the animation and documentary genres. We analysed three communicative modes: original audio, audio description, and visual content. Trained annotators semantically annotated each mode following the FrameNet Brazil multimodal methodology. To compare meaning across modalities, we used cosine similarity over frame-semantic representations. Results show that audio description aligns more closely with video content than original audio, reflecting its role in translating visual meaning into language. Our findings demonstrate the effectiveness of frame semantics in modelling meaning across modalities and provide quantitative evidence of audio description as a bridge between visual and verbal communication. The dataset and annotation strategies are a valuable resource for research on multimodal representation, semantic similarity, and accessible media.
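As a rough illustration of the comparison step described in this abstract, the sketch below computes cosine similarity between bag-of-frames count vectors, one per communicative mode. The abstract does not specify how the frame-semantic representations are built, so the count-vector encoding is an assumption, and the frame labels and counts are toy values chosen for illustration, not data from Audition.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse frame-count vectors."""
    # Only frames present in both vectors contribute to the dot product.
    dot = sum(a[frame] * b[frame] for frame in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for an empty vector
    return dot / (norm_a * norm_b)


# Toy frame counts per mode (illustrative values, not from the dataset).
video = Counter({"People": 3, "Motion": 2, "Buildings": 1})
audio_description = Counter({"People": 2, "Motion": 2, "Buildings": 1})
original_audio = Counter({"Statement": 3, "People": 1})

sim_ad = cosine_similarity(video, audio_description)
sim_audio = cosine_similarity(video, original_audio)
print(f"video vs. audio description: {sim_ad:.3f}")
print(f"video vs. original audio:    {sim_audio:.3f}")
```

With these toy counts the audio description vector shares most of its frames with the video vector, so its similarity score is higher than that of the original audio. This mirrors only the direction of the reported finding, not its magnitude.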
2024
Framed Multi30K: A Frame-Based Multimodal-Multilingual Dataset
Marcelo Viridiano | Arthur Lorenzi | Tiago Timponi Torrent | Ely E. Matos | Adriana S. Pagano | Natália Sathler Sigiliano | Maucha Gamonal | Helen de Andrade Abreu | Lívia Vicente Dutra | Mairon Samagaio | Mariane Carvalho | Franciany Campos | Gabrielly Azalim | Bruna Mazzei | Mateus Fonseca de Oliveira | Ana Carolina Luz | Livia Padua Ruiz | Júlia Bellei | Amanda Pestana | Josiane Costa | Iasmin Rabelo | Anna Beatriz Silva | Raquel Roza | Mariana Souza Mota | Igor Oliveira | Márcio Henrique Pelegrino de Freitas
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
This paper presents Framed Multi30K (FM30K), a novel frame-based Brazilian Portuguese multimodal-multilingual dataset which (i) extends the Multi30K dataset (Elliott et al., 2016) with 158,915 original Brazilian Portuguese descriptions and 30,104 Brazilian Portuguese translations from original English descriptions; (ii) adds 2,677,613 frame evocation labels to the 158,915 English descriptions and to the ones created for Brazilian Portuguese; and (iii) extends the Flickr30k Entities dataset (Plummer et al., 2015) with 190,608 frames and Frame Elements correlations with the existing phrase-to-region correlations.
Frame2: A FrameNet-based Multimodal Dataset for Tackling Text-image Interactions in Video
Frederico Belcavello | Tiago Timponi Torrent | Ely E. Matos | Adriana S. Pagano | Maucha Gamonal | Natalia Sigiliano | Lívia Vicente Dutra | Helen de Andrade Abreu | Mairon Samagaio | Mariane Carvalho | Franciany Campos | Gabrielly Azalim | Bruna Mazzei | Mateus Fonseca de Oliveira | Ana Carolina Loçasso Luz | Lívia Pádua Ruiz | Júlia Bellei | Amanda Pestana | Josiane Costa | Iasmin Rabelo | Anna Beatriz Silva | Raquel Roza | Mariana Souza | Igor Oliveira
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
This paper presents the Frame2 dataset, a multimodal dataset built from a corpus of a Brazilian travel TV show annotated with FrameNet categories in both the text and image communicative modes. Frame2 comprises 230 minutes of video, which are correlated with 2,915 sentences that either transcribe the audio spoken during the episodes or subtitle the segments of the show where the host conducts interviews in English. This first release of the dataset includes a total of 11,796 annotation sets for the sentences and 6,841 for the video. Each of the former includes a target lexical unit evoking a frame or one or more frame elements. For each video annotation, a bounding box in the image is correlated with a frame, a frame element, and a lexical unit evoking a frame in FrameNet.
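One way to picture the video annotations described in this abstract is as a small record type linking a bounding box to a frame, a frame element, and a lexical unit. The sketch below is a hypothetical schema for illustration only: the class names, field names, pixel coordinates, and example values are assumptions and do not reflect the actual Frame2 release format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Rectangular image region in pixels (hypothetical layout)."""
    x: int
    y: int
    width: int
    height: int


@dataclass(frozen=True)
class VideoAnnotation:
    """A bounding box correlated with a frame, a frame element, and a lexical unit."""
    box: BoundingBox
    frame: str          # frame name
    frame_element: str  # frame element within that frame
    lexical_unit: str   # lexical unit evoking the frame


# Illustrative values only; not taken from the dataset.
annotation = VideoAnnotation(
    box=BoundingBox(x=120, y=80, width=200, height=150),
    frame="Buildings",
    frame_element="Building",
    lexical_unit="igreja.n",
)
print(annotation.frame, annotation.box.width * annotation.box.height)
```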
Co-authors
- Helen de Andrade Abreu 3
- Adriana S. Pagano 3
- Lívia Pádua Ruiz 3
- Tiago Timponi Torrent 3
- Gabrielly Azalim 2
- Frederico Belcavello 2
- Júlia Bellei 2
- Franciany Campos 2
- Mariane Carvalho 2
- Josiane Costa 2
- Maucha Gamonal 2
- Arthur Lorenzi 2
- Ely E. Matos 2
- Bruna Mazzei 2
- Igor Oliveira 2
- Amanda Pestana 2
- Iasmin Rabelo 2
- Raquel Roza 2
- Mairon Samagaio 2
- Anna Beatriz Silva 2
- Marcelo Viridiano 2
- Mateus Fonseca de Oliveira 2
- Lisandra Carvalho Bonoto 1
- Kenneth Brown 1
- Franciany O. Campos 1
- André Coneglian 1
- Maucha Andrade Gamonal 1
- Victor A. S. Herbst 1
- Ana Carolina Luz 1
- Ana Carolina Loçasso Luz 1
- Ely Edison Matos 1
- Flávia Affonso Mayer 1
- Yulla Liquer Navarro 1
- Márcio Henrique Pelegrino de Freitas 1
- Luiz Fernando Pereira 1
- Natália Sathler Sigiliano 1
- Natalia Sigiliano 1
- Natalia S. Sigiliano 1
- Mariana Souza 1
- Mariana Souza Mota 1