Marcelo Viridiano
2026
Evaluating FrameNet-Based Semantic Modeling for Gender-Based Violence Detection in Clinical Records
Lívia Dutra | Arthur Lorenzi | Frederico Belcavello | Ely Matos | Marcelo Viridiano | Lorena Larré | Olívia Guaranha | Erick Santos | Sofia Reinach | Pedro de Paula | Tiago Torrent
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2
Gender-based violence (GBV) is a major public health issue, with the World Health Organization estimating that one in three women experiences physical or sexual violence by an intimate partner during her lifetime. In Brazil, although healthcare professionals are legally required to report such cases, underreporting remains significant due to difficulties in identifying abuse and limited integration between public information systems. This study investigates whether FrameNet-based semantic annotation of open-text fields in electronic medical records can support the identification of patterns of GBV. We compare the performance of an SVM classifier for GBV cases trained on (1) frame-annotated text, (2) annotated text combined with parameterized data, and (3) parameterized data alone. Quantitative and qualitative analyses show that models incorporating semantic annotation outperform categorical models, achieving over 0.3 improvement in F1 score and demonstrating that domain-specific semantic representations provide meaningful signals beyond structured demographic data. The findings support the hypothesis that semantic analysis of clinical narratives can enhance early identification strategies and support more informed public health interventions.
2025
Audition: A Frame-Annotated Multimodal Dataset for Accessible Audiovisual Content
Maucha Andrade Gamonal | Tiago Timponi Torrent | Ely Edison Matos | Adriana S. Pagano | Frederico Belcavello | Flávia Affonso Mayer | Arthur Lorenzi | Natalia S. Sigiliano | Helen de Andrade Abreu | Lívia Vicente Dutra | Marcelo Viridiano | André Coneglian | Victor A. S. Herbst | Franciany O. Campos | Kenneth Brown | Lívia Padua Ruiz | Lisandra Carvalho Bonoto | Luiz Fernando Pereira | Yulla Liquer Navarro
Proceedings of the 21st Joint ACL - ISO Workshop on Interoperable Semantic Annotation (ISA-21)
This paper presents a multimodal semantic analysis of accessible Brazilian short films using a frame-based annotation approach. We introduce a subset of the Audition dataset, comprising six short films from the animation and documentary genres. We analysed three communicative modes: original audio, audio description, and visual content. Trained annotators semantically annotated each mode following the FrameNet Brazil multimodal methodology. To compare meaning across modalities, we used cosine similarity over frame-semantic representations. Results show that audio description aligns more closely with video content than original audio, reflecting its role in translating visual meaning into language. Our findings demonstrate the effectiveness of frame semantics in modelling meaning across modalities and provide quantitative evidence of audio description as a bridge between visual and verbal communication. The dataset and annotation strategies are a valuable resource for research on multimodal representation, semantic similarity, and accessible media.
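The cross-modal comparison mentioned in this abstract can be illustrated with a minimal sketch. It assumes each communicative mode is reduced to a bag of evoked frames and that cosine similarity is computed over the resulting count vectors; the abstract does not specify the exact representation, so the helper names `frame_vector` and `cosine_similarity` and the frame labels below are illustrative only and are not drawn from the Audition dataset.

```python
from collections import Counter
from math import sqrt


def frame_vector(frames):
    """Build a sparse count vector from a list of frame labels (assumed representation)."""
    return Counter(frames)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse frame-count vectors."""
    # Only frames present in both vectors contribute to the dot product.
    dot = sum(a[f] * b[f] for f in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty annotation shares nothing with any other mode
    return dot / (norm_a * norm_b)


# Hypothetical annotations for one scene, one list per communicative mode.
audio_description = frame_vector(["Self_motion", "People", "Buildings"])
video = frame_vector(["Self_motion", "People", "Roadways"])
original_audio = frame_vector(["Statement", "People"])

print(cosine_similarity(audio_description, video))
print(cosine_similarity(original_audio, video))
```

With these made-up labels the audio description shares two frames with the video and the original audio shares one, so the first score is higher. This mirrors the direction of the reported result but says nothing about its size.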
SHADES: Towards a Multilingual Assessment of Stereotypes in Large Language Models
Margaret Mitchell | Giuseppe Attanasio | Ioana Baldini | Miruna Clinciu | Jordan Clive | Pieter Delobelle | Manan Dey | Sil Hamilton | Timm Dill | Jad Doughman | Ritam Dutt | Avijit Ghosh | Jessica Zosa Forde | Carolin Holtermann | Lucie-Aimée Kaffee | Tanmay Laud | Anne Lauscher | Roberto L Lopez-Davila | Maraim Masoud | Nikita Nangia | Anaelia Ovalle | Giada Pistilli | Dragomir Radev | Beatrice Savoldi | Vipul Raheja | Jeremy Qin | Esther Ploeger | Arjun Subramonian | Kaustubh Dhole | Kaiser Sun | Amirbek Djanibekov | Jonibek Mansurov | Kayo Yin | Emilio Villa Cueva | Sagnik Mukherjee | Jerry Huang | Xudong Shen | Jay Gala | Hamdan Al-Ali | Tair Djanibekov | Nurdaulet Mukhituly | Shangrui Nie | Shanya Sharma | Karolina Stanczak | Eliza Szczechla | Tiago Timponi Torrent | Deepak Tunuguntla | Marcelo Viridiano | Oskar Van Der Wal | Adina Yakefu | Aurélie Névéol | Mike Zhang | Sydney Zink | Zeerak Talat
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Large Language Models (LLMs) reproduce and exacerbate the social biases present in their training data, and resources to quantify this issue are limited. While research has attempted to identify and mitigate such biases, most efforts have been concentrated around English, lagging the rapid advancement of LLMs in multilingual settings. In this paper, we introduce a new multilingual parallel dataset SHADES to help address this issue, designed for examining culturally-specific stereotypes that may be learned by LLMs. The dataset includes stereotypes from 20 regions around the world and 16 languages, spanning multiple identity categories subject to discrimination worldwide. We demonstrate its utility in a series of exploratory evaluations for both “base” and “instruction-tuned” language models. Our results suggest that stereotypes are consistently reflected across models and languages, with some languages and models indicating much stronger stereotype biases than others.
2024
Framed Multi30K: A Frame-Based Multimodal-Multilingual Dataset
Marcelo Viridiano | Arthur Lorenzi | Tiago Timponi Torrent | Ely E. Matos | Adriana S. Pagano | Natália Sathler Sigiliano | Maucha Gamonal | Helen de Andrade Abreu | Lívia Vicente Dutra | Mairon Samagaio | Mariane Carvalho | Franciany Campos | Gabrielly Azalim | Bruna Mazzei | Mateus Fonseca de Oliveira | Ana Carolina Luz | Livia Padua Ruiz | Júlia Bellei | Amanda Pestana | Josiane Costa | Iasmin Rabelo | Anna Beatriz Silva | Raquel Roza | Mariana Souza Mota | Igor Oliveira | Márcio Henrique Pelegrino de Freitas
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
This paper presents Framed Multi30K (FM30K), a novel frame-based Brazilian Portuguese multimodal-multilingual dataset which (i) extends the Multi30K dataset (Elliott et al., 2016) with 158,915 original Brazilian Portuguese descriptions and 30,104 Brazilian Portuguese translations from original English descriptions; (ii) adds 2,677,613 frame evocation labels to the 158,915 English descriptions and to the ones created for Brazilian Portuguese; and (iii) extends the Flickr30k Entities dataset (Plummer et al., 2015) with 190,608 frames and Frame Elements correlations with the existing phrase-to-region correlations.
2022
Charon: A FrameNet Annotation Tool for Multimodal Corpora
Frederico Belcavello | Marcelo Viridiano | Ely Matos | Tiago Timponi Torrent
Proceedings of the 16th Linguistic Annotation Workshop (LAW-XVI) within LREC2022
This paper presents Charon, a web tool for annotating multimodal corpora with FrameNet categories. Annotation can be made for corpora containing both static images and video sequences paired – or not – with text sequences. The pipeline features, besides the annotation interface, corpus import and pre-processing tools.
Lutma: A Frame-Making Tool for Collaborative FrameNet Development
Tiago Timponi Torrent | Arthur Lorenzi | Ely Edison Matos | Frederico Belcavello | Marcelo Viridiano | Maucha Andrade Gamonal
Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022
This paper presents Lutma, a collaborative, semi-constrained, tutorial-based tool for contributing frames and lexical units to the Global FrameNet initiative. The tool parameterizes the process of frame creation, avoiding consistency violations and promoting the integration of frames contributed by the community with existing frames. Lutma is structured in a wizard-like fashion so as to provide users with text and video tutorials relevant for each step in the frame creation process. We argue that this tool will allow for a sensible expansion of FrameNet coverage in terms of both languages and cultural perspectives encoded by them, positioning frames as a viable alternative for representing perspective in language models.
The Case for Perspective in Multimodal Datasets
Marcelo Viridiano | Tiago Timponi Torrent | Oliver Czulo | Arthur Lorenzi | Ely Matos | Frederico Belcavello
Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022
This paper argues in favor of the adoption of annotation practices for multimodal datasets that recognize and represent the inherently perspectivized nature of multimodal communication. To support our claim, we present a set of annotation experiments in which FrameNet annotation is applied to the Multi30k and the Flickr 30k Entities datasets. We assess the cosine similarity between the semantic representations derived from the annotation of both pictures and captions for frames. Our findings indicate that: (i) frame semantic similarity between captions of the same picture produced in different languages is sensitive to whether the caption is a translation of another caption or not, and (ii) picture annotation for semantic frames is sensitive to whether the image is annotated in presence of a caption or not.
2020
Frame-Based Annotation of Multimodal Corpora: Tracking (A)Synchronies in Meaning Construction
Frederico Belcavello | Marcelo Viridiano | Alexandre Diniz da Costa | Ely Edison da Silva Matos | Tiago Timponi Torrent
Proceedings of the International FrameNet Workshop 2020: Towards a Global, Multilingual FrameNet
Multimodal aspects of human communication are key in several applications of Natural Language Processing, such as Machine Translation and Natural Language Generation. Despite recent advances in integrating multimodality into Computational Linguistics, the merging of NLP and Computer Vision techniques is still timid, especially when it comes to providing fine-grained accounts of meaning construction. This paper reports on research aiming to determine an appropriate methodology and develop a computational tool to annotate multimodal corpora according to a principled structured semantic representation of events, relations and entities: FrameNet. Taking a Brazilian television travel show as corpus, a pilot study was conducted to annotate the frames that are evoked by the audio and the ones that are evoked by visual elements. We also implemented a Multimodal Annotation tool which allows annotators to choose frames and locate frame elements both in the text and in the images, while keeping track of the time span in which those elements are active in each modality. Results suggest that adding a multimodal domain to the linguistic layer of annotation and analysis contributes both to enriching the kind of information that can be tagged in a corpus and to enhancing FrameNet as a model of linguistic cognition.
Co-authors
- Tiago Timponi Torrent 8
- Frederico Belcavello 6
- Arthur Lorenzi 5
- Ely Edison da Silva Matos 4
- Helen de Andrade Abreu 2
- Lívia Vicente Dutra 2
- Maucha Andrade Gamonal 2
- Ely Edison Matos 2
- Adriana S. Pagano 2
- Lívia Pádua Ruiz 2
- Hamdan Al-Ali 1
- Giuseppe Attanasio 1
- Gabrielly Azalim 1
- Ioana Baldini 1
- Júlia Bellei 1
- Lisandra Carvalho Bonoto 1
- Kenneth Brown 1
- Franciany Campos 1
- Franciany O. Campos 1
- Mariane Carvalho 1
- Miruna Clinciu 1
- Jordan Clive 1
- André Coneglian 1
- Josiane Costa 1
- Alexandre Diniz da Costa 1
- Oliver Czulo 1
- Pieter Delobelle 1
- Manan Dey 1
- Kaustubh Dhole 1
- Timm Dill 1
- Amirbek Djanibekov 1
- Jad Doughman 1
- Lívia Dutra 1
- Ritam Dutt 1
- Jessica Zosa Forde 1
- Jay Gala 1
- Maucha Gamonal 1
- Avijit Ghosh 1
- Olívia Guaranha 1
- Sil Hamilton 1
- Victor A. S. Herbst 1
- Carolin Holtermann 1
- Jerry Huang 1
- Lucie-Aimée Kaffee 1
- Lorena Larré 1
- Tanmay Laud 1
- Anne Lauscher 1
- Roberto L Lopez-Davila 1
- Ana Carolina Luz 1
- Jonibek Mansurov 1
- Maraim Masoud 1
- Ely E. Matos 1
- Flávia Affonso Mayer 1
- Bruna Mazzei 1
- Margaret Mitchell 1
- Sagnik Mukherjee 1
- Nurdaulet Mukhituly 1
- Nikita Nangia 1
- Yulla Liquer Navarro 1
- Aurelie Neveol 1
- Shangrui Nie 1
- Igor Oliveira 1
- Anaelia Ovalle 1
- Pedro de Paula 1
- Márcio Henrique Pelegrino de Freitas 1
- Luiz Fernando Pereira 1
- Amanda Pestana 1
- Giada Pistilli 1
- Esther Ploeger 1
- Jeremy Qin 1
- Iasmin Rabelo 1
- Dragomir Radev 1
- Vipul Raheja 1
- Sofia Reinach 1
- Raquel Roza 1
- Mairon Samagaio 1
- Erick Santos 1
- Beatrice Savoldi 1
- Shanya Sharma 1
- Xudong Shen 1
- Natália Sathler Sigiliano 1
- Natalia S. Sigiliano 1
- Anna Beatriz Silva 1
- Mariana Souza Mota 1
- Karolina Stanczak 1
- Arjun Subramonian 1
- Kaiser Sun 1
- Eliza Szczechla 1
- Tair Djanibekov 1
- Zeerak Talat 1
- Deepak Tunuguntla 1
- Oskar Van Der Wal 1
- Emilio Villa-Cueva 1
- Adina Yakefu 1
- Kayo Yin 1
- Mike Zhang 1
- Sydney Zink 1
- Mateus Fonseca de Oliveira 1