2017
On the “Calligraphy” of Books
Vanessa Queiroz Marinho | Henrique Ferraz de Arruda | Thales Sinelli | Luciano da Fontoura Costa | Diego Raphael Amancio
Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing
Authorship attribution is a natural language processing task that has been widely studied, often by considering small order statistics. In this paper, we explore a complex network approach that assigns the authorship of texts based on their mesoscopic representation, in an attempt to capture the flow of the narrative. As reported in this work, this approach allowed the identification of the dominant narrative structure of the studied authors. This was possible because the mesoscopic approach takes into account relationships between different, not necessarily adjacent, parts of the text, and can therefore capture the story flow. The potential of the proposed approach is illustrated through principal component analysis, a comparison with the chance baseline method, and network visualization. The visualizations reveal individual characteristics of the authors, which can be understood as a kind of calligraphy.
Enriching Complex Networks with Word Embeddings for Detecting Mild Cognitive Impairment from Speech Transcripts
Leandro Santos | Edilson Anselmo Corrêa Júnior | Osvaldo Oliveira Jr | Diego Amancio | Letícia Mansur | Sandra Aluísio
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Mild Cognitive Impairment (MCI) is a mental disorder that is difficult to diagnose. Linguistic features, mainly from parsers, have been used to detect MCI, but this approach is not suitable for large-scale assessments. MCI disfluencies produce non-grammatical speech that requires manual or high-precision automatic correction of transcripts. In this paper, we modeled transcripts as complex networks and enriched them with word embeddings (CNE) to better represent the short texts produced in neuropsychological assessments. The network measurements were used with well-known classifiers to automatically identify MCI in transcripts, in a binary classification task. The results were compared with the performance of traditional approaches using Bag of Words (BoW) and linguistic features on three datasets: DementiaBank in English, and Cinderella and Arizona-Battery in Portuguese. Overall, CNE provided higher accuracy than complex networks alone, and Support Vector Machine was superior to the other classifiers. CNE provided the highest accuracies for DementiaBank and Cinderella, but BoW was more efficient for the Arizona-Battery dataset, probably owing to its short narratives. The approach using linguistic features yielded higher accuracy when the transcriptions of the Cinderella dataset were manually revised. Taken together, the results indicate that complex networks enriched with embeddings are promising for detecting MCI in large-scale assessments.
2010
Distinguishing between Positive and Negative Opinions with Complex Network Features
Diego Raphael Amancio | Renato Fabbri | Osvaldo Novais Oliveira Jr. | Maria das Graças Volpe Nunes | Luciano da Fontoura Costa
Proceedings of TextGraphs-5 - 2010 Workshop on Graph-based Methods for Natural Language Processing