Ahmad Muhyiddin Yusof


XLNET-GRU Sentiment Regression Model for Cryptocurrency News in English and Malay
Nur Azmina Mohamad Zamani | Jasy Suet Yan Liew | Ahmad Muhyiddin Yusof
Proceedings of the 4th Financial Narrative Processing Workshop @LREC2022

Contextual word embeddings from transformer language models are gaining popularity in text classification and analytics, but they have rarely been explored for sentiment analysis of cryptocurrency news, particularly in languages other than English. Various state-of-the-art (SOTA) pre-trained language models for text representation have been introduced recently, such as BERT, ALBERT, ELECTRA, RoBERTa, and XLNet. This study therefore investigates the performance of a Gated Recurrent Unit (GRU) combined with XLNet (Generalized Autoregressive Pretraining for Language Understanding) contextual word embeddings for sentiment analysis of English and Malay cryptocurrency news (Bitcoin and Ethereum). We also compare the performance of our XLNet-GRU model against other SOTA pre-trained language models. Manually labelled corpora of English and Malay news are used to learn the context of text specifically in the cryptocurrency domain. In our experiments, the XLNet-GRU sentiment regression model outperformed the lexicon-based baseline, with a mean adjusted R² = 0.631 across Bitcoin and Ethereum for English and a mean adjusted R² = 0.514 for Malay.
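
The abstract reports results as adjusted R-squared, which corrects the ordinary coefficient of determination for the number of predictors: adjusted R2 = 1 - (1 - R2)(n - 1)/(n - p - 1), where n is the number of samples and p the number of predictors. The following is a minimal sketch of that standard formula only; the function names, the toy sentiment scores, and the choice of p = 1 are illustrative assumptions, not the authors' evaluation code or data.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Ordinary coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(
    y_true: Sequence[float], y_pred: Sequence[float], n_predictors: int
) -> float:
    """Adjusted R-squared: 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    n = len(y_true)
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more samples than predictors plus one")
    r2 = r_squared(y_true, y_pred)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Hypothetical sentiment scores (gold labels vs. model predictions).
gold = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]

# p = 1 is an assumption made purely for this illustration.
score = adjusted_r_squared(gold, predicted, n_predictors=1)
print(f"adjusted R2 = {score:.4f}")
```

Because the penalty term (n - 1)/(n - p - 1) is at least 1, the adjusted value never exceeds the ordinary R-squared, which makes it a more conservative basis for comparing regression models.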