Flávia Affonso Mayer
2025
Audition: A Frame-Annotated Multimodal Dataset for Accessible Audiovisual Content
Maucha Andrade Gamonal | Tiago Timponi Torrent | Ely Edison Matos | Adriana S. Pagano | Frederico Belcavello | Flávia Affonso Mayer | Arthur Lorenzi | Natalia S. Sigiliano | Helen de Andrade Abreu | Lívia Vicente Dutra | Marcelo Viridiano | André Coneglian | Victor A. S. Herbst | Franciany O. Campos | Kenneth Brown | Lívia Padua Ruiz | Lisandra Carvalho Bonoto | Luiz Fernando Pereira | Yulla Liquer Navarro
Proceedings of the 21st Joint ACL - ISO Workshop on Interoperable Semantic Annotation (ISA-21)
This paper presents a multimodal semantic analysis of accessible Brazilian short films using a frame-based annotation approach. We introduce a subset of the Audition dataset, comprising six short films from the animation and documentary genres. We analysed three communicative modes: original audio, audio description, and visual content. Trained annotators semantically annotated each mode following the FrameNet Brazil multimodal methodology. To compare meaning across modalities, we used cosine similarity over frame-semantic representations. Results show that audio description aligns more closely with video content than the original audio does, reflecting its role in translating visual meaning into language. Our findings demonstrate the effectiveness of frame semantics in modelling meaning across modalities and provide quantitative evidence of audio description as a bridge between visual and verbal communication. The dataset and annotation strategies are a valuable resource for research on multimodal representation, semantic similarity, and accessible media.
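As a rough illustration of the comparison step, the sketch below computes cosine similarity between bag-of-frames vectors for two modes. The abstract does not say how the frame-semantic representations were built, so the count-based vectors, the helper names `frame_vector` and `cosine_similarity`, and the frame labels in the toy data are all assumptions made for this example, not the authors' implementation or data.

```python
import math
from collections import Counter


def frame_vector(frames):
    """Build a sparse count vector from a list of evoked frame labels.

    Assumption: each mode is represented by how often each frame was annotated.
    """
    return Counter(frames)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    # Only frames present in both vectors contribute to the dot product.
    dot = sum(a[f] * b[f] for f in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty annotation set shares nothing with any other mode.
        return 0.0
    return dot / (norm_a * norm_b)


# Invented toy annotations, for illustration only (not taken from the dataset).
video = frame_vector(["People", "Motion", "Buildings", "Motion"])
audio_description = frame_vector(["People", "Motion", "Buildings"])
original_audio = frame_vector(["Statement", "People", "Emotion_directed"])

print(cosine_similarity(audio_description, video))
print(cosine_similarity(original_audio, video))
```

With real annotations, each list would hold the frames evoked in one mode of one film, and the two printed scores would correspond to the audio description versus video and original audio versus video comparisons described in the abstract.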