Computing with Subjectivity Lexicons
Caio L. M. Jeronimo | Claudio E. C. Campelo | Leandro Balby Marinho | Allan Sales | Adriano Veloso | Roberta Viola
Proceedings of the Twelfth Language Resources and Evaluation Conference
In this paper, we introduce a new set of lexicons for expressing subjectivity in text documents written in Brazilian Portuguese. Besides the non-English idiom, in contrast to other subjectivity lexicons available, these lexicons represent different subjectivity dimensions (other than sentiment) and are more compact in number of terms. This last feature was designed intentionally to leverage the power of word embedding techniques, i.e., with the words mapped to an embedding space and the appropriate distance measures, we can easily capture semantically related words to the ones in the lexicons. Thus, we do not need to build comprehensive vocabularies and can focus on the most representative words for each lexicon dimension. We showcase the use of these lexicons in three highly non-trivial tasks: (1) Automated Essay Scoring in the Presence of Biased Ratings, (2) Subjectivity Bias in Brazilian Presidential Elections and (3) Fake News Classification Based on Text Subjectivity. All these tasks involve text documents written in Portuguese.