Bootstrapping Sentiment Labels For Unannotated Documents With Polarity PageRank

Christian Scheible, Hinrich Schütze


Abstract
We present a novel graph-theoretic method for the initial annotation of high-confidence training data for bootstrapping sentiment classifiers. We estimate polarity using topic-specific PageRank. Sentiment information is propagated from an initial seed lexicon through a joint graph representation of words and documents. We report improved classification accuracies across multiple domains for the base models and the maximum entropy model bootstrapped from the PageRank annotation.
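The propagation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the unweighted word-document edges, the damping factor of 0.85, the fixed iteration count, the scoring rule (positive rank minus negative rank), and all function names are assumptions made for illustration only.

```python
from collections import defaultdict


def build_graph(docs):
    """Undirected bipartite graph: each document node links to its word nodes."""
    graph = defaultdict(set)
    for doc_id, tokens in docs.items():
        for tok in tokens:
            graph[("doc", doc_id)].add(("word", tok))
            graph[("word", tok)].add(("doc", doc_id))
    return graph


def personalized_pagerank(graph, seeds, damping=0.85, iters=50):
    """Power iteration in which teleport mass goes only to the seed nodes."""
    nodes = list(graph)
    seeds = [s for s in seeds if s in graph]
    if not seeds:
        return {n: 0.0 for n in nodes}
    teleport = {n: 0.0 for n in nodes}
    for s in seeds:
        teleport[s] = 1.0 / len(seeds)
    rank = dict(teleport)
    for _ in range(iters):
        # Every node restarts at the seeds with probability (1 - damping).
        new = {n: (1 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            # Each node has at least one neighbour by construction.
            share = damping * rank[n] / len(graph[n])
            for m in graph[n]:
                new[m] += share
        rank = new
    return rank


def polarity_scores(docs, pos_seeds, neg_seeds):
    """Assumed scoring rule: positive-seeded rank minus negative-seeded rank."""
    graph = build_graph(docs)
    pos = personalized_pagerank(graph, [("word", w) for w in pos_seeds])
    neg = personalized_pagerank(graph, [("word", w) for w in neg_seeds])
    return {
        d: pos.get(("doc", d), 0.0) - neg.get(("doc", d), 0.0) for d in docs
    }


# Toy usage: two hypothetical documents and a one-word seed lexicon per class.
docs = {"d1": ["great", "plot"], "d2": ["awful", "plot"]}
scores = polarity_scores(docs, pos_seeds=["great"], neg_seeds=["awful"])
```

In this toy setup, documents with the largest absolute scores would be the candidates for high-confidence labels; how many to keep, and at what threshold, is left open here.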
Anthology ID:
L12-1012
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1230–1234
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/124_Paper.pdf
Cite (ACL):
Christian Scheible and Hinrich Schütze. 2012. Bootstrapping Sentiment Labels For Unannotated Documents With Polarity PageRank. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 1230–1234, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Bootstrapping Sentiment Labels For Unannotated Documents With Polarity PageRank (Scheible & Schütze, LREC 2012)