Towards Sentiment Analysis of Financial Texts in Croatian

Željko Agić, Nikola Ljubešić, Marko Tadić


Abstract
The paper presents results of an experiment dealing with sentiment analysis of Croatian text from the domain of finance. The goal of the experiment was to design a system model for automatic detection of general sentiment and polarity phrases in these texts. We have assembled a document collection from web sources writing on the financial market in Croatia and manually annotated articles from a subset of that collection for general sentiment. Additionally, we have manually annotated a number of these articles for phrases encoding positive or negative sentiment within a text. In the paper, we provide an analysis of the compiled resources. We show a statistically significant correspondence (1) between the overall market trend on the Zagreb Stock Exchange and the number of positively and negatively accented articles within periods of trend and (2) between the general sentiment of articles and the number of polarity phrases within those articles. We use this analysis as an input for designing a rule-based local grammar system for automatic detection of polarity phrases and evaluate it on held out data. The system achieves F1-scores of 0.61 (P: 0.94, R: 0.45) and 0.63 (P: 0.97, R: 0.47) on positive and negative polarity phrases.
Anthology ID:
L10-1603
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/876_Paper.pdf
DOI:
Bibkey:
Cite (ACL):
Željko Agić, Nikola Ljubešić, and Marko Tadić. 2010. Towards Sentiment Analysis of Financial Texts in Croatian. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Towards Sentiment Analysis of Financial Texts in Croatian (Agić et al., LREC 2010)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/876_Paper.pdf