Leveraging Multilingual Resources for Language Invariant Sentiment Analysis

Allen Antony, Arghya Bhattacharya, Jaipal Goud, Radhika Mamidi


Abstract
Sentiment analysis is a widely researched NLP problem, with state-of-the-art solutions capable of attaining human-like accuracy for various languages. However, these methods rely heavily on large amounts of labeled data or on sentiment-weighted, language-specific lexical resources that are unavailable for low-resource languages. Our work attempts to tackle this data scarcity issue by introducing a neural architecture for language-invariant sentiment analysis that can leverage multiple monolingual datasets for training without any cross-lingual supervision. The proposed architecture attempts to learn language-agnostic sentiment features via adversarial training on multiple resource-rich languages; these features can then be leveraged to infer sentence-level sentiment in a low-resource language. Our model outperforms the current state-of-the-art methods on the Multilingual Amazon Review Text Classification dataset [REF] and achieves significant performance gains over prior work on the low-resource Sentiraama corpus [REF]. A detailed analysis highlights the ability of our architecture to perform well with minimal amounts of training data for low-resource languages.
Anthology ID:
2020.eamt-1.9
Volume:
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation
Month:
November
Year:
2020
Address:
Lisboa, Portugal
Editors:
André Martins, Helena Moniz, Sara Fumega, Bruno Martins, Fernando Batista, Luisa Coheur, Carla Parra, Isabel Trancoso, Marco Turchi, Arianna Bisazza, Joss Moorkens, Ana Guerberof, Mary Nurminen, Lena Marg, Mikel L. Forcada
Venue:
EAMT
Publisher:
European Association for Machine Translation
Pages:
71–79
URL:
https://aclanthology.org/2020.eamt-1.9
Cite (ACL):
Allen Antony, Arghya Bhattacharya, Jaipal Goud, and Radhika Mamidi. 2020. Leveraging Multilingual Resources for Language Invariant Sentiment Analysis. In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, pages 71–79, Lisboa, Portugal. European Association for Machine Translation.
Cite (Informal):
Leveraging Multilingual Resources for Language Invariant Sentiment Analysis (Antony et al., EAMT 2020)
PDF:
https://aclanthology.org/2020.eamt-1.9.pdf