Cross-Lingual Classification of Topics in Political Texts

Goran Glavaš, Federico Nanni, Simone Paolo Ponzetto


Abstract
In this paper, we propose an approach for cross-lingual topical coding of sentences from electoral manifestos of political parties in different languages. To this end, we exploit continuous semantic text representations and induce a joint multilingual semantic vector space to enable supervised learning using manually coded sentences across different languages. Our experimental results show that classifiers trained on multilingual data yield performance boosts over monolingual topic classification.
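
The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the joint space is induced or which classifier is used, so everything concrete below is an assumption made for illustration: a least-squares linear projection learned from a small translation dictionary, averaged word vectors as sentence representations, a nearest-centroid classifier, and toy three-dimensional embeddings with invented words and labels.

```python
import numpy as np

def learn_projection(src_vecs, tgt_vecs):
    """Least-squares linear map W such that src_vecs @ W approximates tgt_vecs."""
    W, *_ = np.linalg.lstsq(src_vecs, tgt_vecs, rcond=None)
    return W

def embed_sentence(tokens, vectors, dim):
    """Average the vectors of known tokens; zero vector if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    return np.mean(known, axis=0) if known else np.zeros(dim)

def train_centroids(X, y):
    """Nearest-centroid classifier: one mean sentence vector per topic label."""
    labels = sorted(set(y))
    y_arr = np.array(y)
    return labels, np.stack([X[y_arr == label].mean(axis=0) for label in labels])

def predict(x, labels, centroids):
    """Return the label whose centroid has the highest cosine similarity to x."""
    norms = np.linalg.norm(centroids, axis=1) * np.linalg.norm(x) + 1e-12
    return labels[int(np.argmax(centroids @ x / norms))]

# Toy English embeddings (hypothetical values, for illustration only).
en = {
    "tax": np.array([1.0, 0.0, 0.0]), "budget": np.array([0.9, 0.1, 0.0]),
    "economy": np.array([0.8, 0.0, 0.2]), "school": np.array([0.0, 1.0, 0.0]),
    "teacher": np.array([0.1, 0.9, 0.0]), "education": np.array([0.0, 0.8, 0.2]),
}
# Toy German space: the English space under a fixed rotation (an assumption).
c, s = np.cos(0.5), np.sin(0.5)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
translations = {"steuer": "tax", "haushalt": "budget", "wirtschaft": "economy",
                "schule": "school", "lehrer": "teacher", "bildung": "education"}
de = {g: en[e] @ R for g, e in translations.items()}

# Induce the joint space: map German vectors into the English space,
# using only three dictionary pairs as supervision.
seed = ["steuer", "schule", "wirtschaft"]
W = learn_projection(np.stack([de[g] for g in seed]),
                     np.stack([en[translations[g]] for g in seed]))
de_joint = {g: v @ W for g, v in de.items()}

# Train one classifier on manually coded sentences from both languages.
train = [(["tax", "budget"], en, "economy"),
         (["school", "teacher"], en, "education"),
         (["steuer", "wirtschaft"], de_joint, "economy"),
         (["schule", "bildung"], de_joint, "education")]
X = np.stack([embed_sentence(tokens, vecs, 3) for tokens, vecs, _ in train])
y = [label for _, _, label in train]
labels, centroids = train_centroids(X, y)

# Classify German sentences whose words were not in the seed dictionary.
pred_budget = predict(embed_sentence(["haushalt"], de_joint, 3), labels, centroids)
pred_teacher = predict(embed_sentence(["lehrer"], de_joint, 3), labels, centroids)
print(pred_budget, pred_teacher)
```

Because both languages end up in one vector space, labeled sentences from either language contribute to the same classifier, which is the property the abstract relies on when training on multilingual data.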
Anthology ID:
W17-2906
Volume:
Proceedings of the Second Workshop on NLP and Computational Social Science
Month:
August
Year:
2017
Address:
Vancouver, Canada
Editors:
Dirk Hovy, Svitlana Volkova, David Bamman, David Jurgens, Brendan O’Connor, Oren Tsur, A. Seza Doğruöz
Venue:
NLP+CSS
Publisher:
Association for Computational Linguistics
Pages:
42–46
URL:
https://aclanthology.org/W17-2906
DOI:
10.18653/v1/W17-2906
Cite (ACL):
Goran Glavaš, Federico Nanni, and Simone Paolo Ponzetto. 2017. Cross-Lingual Classification of Topics in Political Texts. In Proceedings of the Second Workshop on NLP and Computational Social Science, pages 42–46, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
Cross-Lingual Classification of Topics in Political Texts (Glavaš et al., NLP+CSS 2017)
PDF:
https://aclanthology.org/W17-2906.pdf