Mariana Souza

2024

This paper presents the Frame2 dataset, a multimodal dataset built from a corpus of a Brazilian travel TV show and annotated with FrameNet categories in both the text and image communicative modes. Frame2 comprises 230 minutes of video correlated with 2,915 sentences, which either transcribe the audio spoken during the episodes or reproduce the subtitles of segments in which the host conducts interviews in English. This first release of the dataset includes a total of 11,796 annotation sets for the sentences and 6,841 for the video. Each of the former includes a target lexical unit evoking a frame or one or more frame elements. For each video annotation, a bounding box in the image is correlated with a frame, a frame element, and a lexical unit evoking a frame in FrameNet.

2023

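Coleta, composição e etapas de pre-processamento de corpus: procedimentos para a anotação multimodal da FrameNet Brasil [Corpus collection, composition, and pre-processing steps: procedures for the multimodal annotation of FrameNet Brasil]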
Anna Silva | Iasmin Rabelo | Igor Oliveira | Mariana Souza | Maucha Gamonal | Raquel Roza
Proceedings of the 14th Brazilian Symposium in Information and Human Language Technology
