More and more investors and machine learning models rely on social media (e.g., Twitter and Reddit) to gather information and predict movements stock prices. Although text-based models are known to be vulnerable to adversarial attacks, whether stock prediction models have similar vulnerability given necessary constraints is underexplored. In this paper, we experiment with a variety of adversarial attack configurations to fool three stock prediction victim models. We address the task of adversarial generation by solving combinatorial optimization problems with semantics and budget constraints. Our results show that the proposed attack method can achieve consistent success rates and cause significant monetary loss in trading simulation by simply concatenating a perturbed but semantically similar tweet.
This paper describes SChME (Semantic Change Detection with Model Ensemble), a method used in SemEval-2020 Task 1 on unsupervised detection of lexical semantic change. SChME uses a model ensemble combining signals distributional models (word embeddings) and word frequency where each model casts a vote indicating the probability that a word suffered semantic change according to that feature. More specifically, we combine cosine distance of word vectors combined with a neighborhood-based metric we named Mapped Neighborhood Distance (MAP), and a word frequency differential metric as input signals to our model. Additionally, we explore alignment-based methods to investigate the importance of the landmarks used in this process. Our results show evidence that the number of landmarks used for alignment has a direct impact on the predictive performance of the model. Moreover, we show that languages that suffer less semantic change tend to benefit from using a large number of landmarks, whereas languages with more semantic change benefit from a more careful choice of landmark number for alignment.
While the celebrated Word2Vec technique yields semantically rich representations for individual words, there has been relatively less success in extending to generate unsupervised sentences or documents embeddings. Recent work has demonstrated that a distance measure between documents called Word Mover’s Distance (WMD) that aligns semantically similar words, yields unprecedented KNN classification accuracy. However, WMD is expensive to compute, and it is hard to extend its use beyond a KNN classifier. In this paper, we propose the Word Mover’s Embedding (WME), a novel approach to building an unsupervised document (sentence) embedding from pre-trained word embeddings. In our experiments on 9 benchmark text classification datasets and 22 textual similarity tasks, the proposed technique consistently matches or outperforms state-of-the-art techniques, with significantly higher accuracy on problems of short length.
Visual language grounding is widely studied in modern neural image captioning systems, which typically adopts an encoder-decoder framework consisting of two principal components: a convolutional neural network (CNN) for image feature extraction and a recurrent neural network (RNN) for language caption generation. To study the robustness of language grounding to adversarial perturbations in machine vision and perception, we propose Show-and-Fool, a novel algorithm for crafting adversarial examples in neural image captioning. The proposed algorithm provides two evaluation approaches, which check if we can mislead neural image captioning systems to output some randomly chosen captions or keywords. Our extensive experiments show that our algorithm can successfully craft visually-similar adversarial examples with randomly targeted captions or keywords, and the adversarial examples can be made highly transferable to other image captioning systems. Consequently, our approach leads to new robustness implications of neural image captioning and novel insights in visual language grounding.