Vanessa Queiroz Marinho


2017

On the “Calligraphy” of Books
Vanessa Queiroz Marinho | Henrique Ferraz de Arruda | Thales Sinelli | Luciano da Fontoura Costa | Diego Raphael Amancio
Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing

Authorship attribution is a widely studied natural language processing task, often approached through low-order statistics. In this paper, we explore a complex network approach that assigns the authorship of texts based on their mesoscopic representation, in an attempt to capture the flow of the narrative. As reported in this work, this approach allowed the identification of the dominant narrative structure of the studied authors. This was possible because the mesoscopic approach takes into account relationships between different, not necessarily adjacent, parts of the text, and can therefore capture the story flow. The potential of the proposed approach is illustrated through principal component analysis, a comparison with a chance baseline, and network visualization. These visualizations reveal individual characteristics of the authors, which can be understood as a kind of calligraphy.

NILC-USP at SemEval-2017 Task 4: A Multi-view Ensemble for Twitter Sentiment Analysis
Edilson Anselmo Corrêa Júnior | Vanessa Queiroz Marinho | Leandro Borges dos Santos
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This paper describes our multi-view ensemble approach to SemEval-2017 Task 4 on Sentiment Analysis in Twitter, specifically, the Message Polarity Classification subtask for English (subtask A). Our system is a voting ensemble, where each base classifier is trained in a different feature space. The first space is a bag-of-words model and has a Linear SVM as base classifier. The second and third spaces are two different strategies of combining word embeddings to represent sentences and use a Linear SVM and a Logistic Regressor as base classifiers. The proposed system was ranked 18th out of 38 systems considering F1 score and 20th considering recall.
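
The idea of a voting ensemble over several feature spaces can be illustrated with a minimal sketch. It is not the authors' system: the abstract does not say whether voting is hard or soft, so plain majority voting is assumed here, and the three "views" below (`bag_of_words`, `emoticons`, `exclamations`) with their hand-written rules are hypothetical stand-ins for the bag-of-words and embedding spaces and for the trained Linear SVM and Logistic Regression classifiers.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# A "view" pairs a feature extractor with a classifier for that feature space.
# All views here are toy stand-ins (assumptions), not the models from the paper.
View = Tuple[Callable[[str], object], Callable[[object], str]]


def majority_vote(labels: Sequence[str]) -> str:
    """Return the most frequent label (hard voting, an assumption)."""
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(views: List[View], text: str) -> str:
    """Let each view classify the text in its own feature space, then vote."""
    votes = [classify(extract(text)) for extract, classify in views]
    return majority_vote(votes)


POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}


def bag_of_words(text: str) -> Counter:
    """View 1 features: lowercase whitespace-token counts."""
    return Counter(text.lower().split())


def lexicon_rule(counts: Counter) -> str:
    """View 1 classifier: compare positive and negative word counts."""
    score = sum(counts[w] for w in POSITIVE) - sum(counts[w] for w in NEGATIVE)
    if score > 0:
        return "positive"
    return "negative" if score < 0 else "neutral"


def emoticons(text: str) -> Tuple[int, int]:
    """View 2 features: counts of happy and sad emoticons."""
    return text.count(":)"), text.count(":(")


def emoticon_rule(pair: Tuple[int, int]) -> str:
    """View 2 classifier: whichever emoticon type dominates."""
    happy, sad = pair
    if happy > sad:
        return "positive"
    return "negative" if sad > happy else "neutral"


def exclamations(text: str) -> int:
    """View 3 features: number of exclamation marks."""
    return text.count("!")


def exclamation_rule(count: int) -> str:
    """View 3 classifier: treat any exclamation mark as a positive cue."""
    return "positive" if count > 0 else "neutral"


VIEWS: List[View] = [
    (bag_of_words, lexicon_rule),
    (emoticons, emoticon_rule),
    (exclamations, exclamation_rule),
]

if __name__ == "__main__":
    print(ensemble_predict(VIEWS, "I love this phone :)"))
```

In a real system each rule would be replaced by a classifier trained on its own feature space; the voting step stays the same regardless of how the individual views are built.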