Tracking And Understanding Public Reaction During COVID-19: Saudi Arabia As A Use Case

The coronavirus disease of 2019 (COVID-19) has had a huge impact on economies and societies around the world. While governments are taking extreme measures to reduce the spread of the virus, people are being affected by these measures. With restrictions such as lockdowns and social distancing, it has become important to understand the emotional response of the public to the pandemic. In this paper, we study the reaction of Saudi Arabian citizens to the pandemic. We use a collection of Arabic tweets sent during 2020, gathered primarily through hashtags that originated in Saudi Arabia. Our results show that people maintained a positive reaction to the pandemic. This positive reaction was at its highest at the beginning of the COVID-19 crisis and declined as time passed. Overall, the results show that people were highly supportive of each other throughout the pandemic. This research can help researchers and policymakers understand the emotional effect of a pandemic on societies.


Introduction
COVID-19 was declared a Public Health Emergency of International Concern by the World Health Organization (WHO) on January 30, 2020, and was characterized as a pandemic on March 11, 2020 (Organization et al., 2020). The spread of COVID-19 around the world has raised many concerns and uncertainties about the future. The world has experienced several epidemics, and they have become more frequent in recent years as globalization facilitates contagion. The current pandemic has affected 213 countries, with more than one and a half million confirmed cases to date (Organization et al., 2020).
To overcome the rapid spread of the virus, the World Health Organization (WHO) has suggested that isolation and self-quarantine are among the major ways to stop this pandemic from spreading at such an alarming rate (Cucinotta and Vanelli, 2020). China has witnessed the benefits of one of the largest lockdowns at the start of this pandemic, where it locked down 20 provinces and regions (Koh, 2020).
One of the most widely used social media platforms is Twitter, popular for its accessibility and ease of information sharing (Wang et al., 2016). With the increase in precautionary measures such as lockdowns and curfews, people have turned to social media platforms to continue their social interactions, with a 61% increase in the usage of these platforms (Holmes, 2020). With 13.8 million active users out of 24 million internet users (91% of the total population), Saudi Arabia is among the countries with the highest number of Twitter users among its online population (Clement, 2020; Puri-Mirza, 2019). Moreover, Saudi Arabia produces 40% of all tweets in the Arab world (Mourtada and Salem, 2014).
Since a majority of online communication is recorded in the form of text data, measuring the emotions around COVID-19 will be a central part of understanding and addressing the impacts of the COVID-19 situation on people. Moreover, the rapid spread of COVID-19 infections has created a strong need for understanding the development of mass sentiment in pandemic scenarios. Because of the uncertainty that is provoked by the COVID-19 situation and the many and long-lasting preventative measures applied, it is of vital importance to understand how governments, NGOs, and social organizations can help those who are most affected by the situation and where they are located. In order to do so, understanding the emotions, worries, and concerns that people have and possible coping strategies is important.
In this study, we try to understand Saudis' reactions towards this pandemic by answering three questions:
- What is the overall sentiment of Saudi citizens towards the pandemic?
- What is the geographical distribution of different sentiments?
- How have Saudis reacted to different topics/decisions related to the pandemic?
Analyzing emotions regarding COVID-19 restrictions will be a central part of understanding and addressing the impacts of pandemics on people. Moreover, this study might inform the development of models of social behavior that fit the Saudi Arabian culture. Our results showed a high level of solidarity within society, where people were expressing positive sentiments in hashtags related to social solidarity. Also, our results showed there was a high correlation between the number of COVID-19 cases and the geolocation of the tweets.
In the remainder of the paper, we first synthesize previous work and background information and relate that to our work. After that, we describe the dataset used for the analysis. We then discuss sentiment identification. Finally, we discuss the findings, conclusions, and proposed directions for future research.
Related Work

Sentiment Analysis
Sentiment analysis is a natural language processing (NLP) technique that detects and extracts the polarity of sentiments in text by determining whether the text is positive, negative, or neutral (García-Díaz et al., 2020). There are two approaches to sentiment analysis: machine learning and lexicon-based. The machine learning approach uses trained algorithms to infer sentiment, while the lexicon-based approach counts positive and negative words (Drus and Khalid, 2019).
Social media is widely used by the public to express opinions, ideas, and emotions, thus allowing researchers to analyze users' sentiments (Bhat et al., 2020). Drus and Khalid (2019) divided social media platforms into four categories depending on their use: content communities such as YouTube and Instagram, social networking sites such as Facebook and LinkedIn, blogs, and microblogs such as Twitter and Tumblr. Of these types, the top category for collecting information was microblogs, especially Twitter, as it is considered one of the most popular websites and allows users to post only short messages (Öztürk and Ayvaz, 2018), with a limit of 280 characters per tweet (Pokharel, 2020). Microblogging also attracts researchers focusing on various languages. Around 500 million tweets are published on Twitter daily, and Twitter users account for more than 22% of the overall internet population (Öztürk and Ayvaz, 2018). García-Díaz et al. (2020) gave two other main reasons for choosing Twitter: (1) it is a popular platform for spreading news and information and is established as an appropriate means of collecting information about public health; and (2) the hashtags common to tweets help the social network act as a hub-and-spoke.
The major role of sentiment analysis in a crisis is highlighted by Öztürk and Ayvaz (2018), who explored sentiments about the civil war in Syria as well as the refugee crisis, using Twitter data consisting of 2,381,297 tweets in Turkish and English. Turkish was selected because of the large number of Syrian refugees based in Turkey. The final result demonstrates that Turkish tweets expressed positive sentiments about Syrians and refugees at a rate of 35%, while English tweets contained 12% positive sentiments.
Furthermore, García-Díaz et al. (2020) used aspect-based sentiment analysis to determine tweets' sentiments by combining deep-learning models with other techniques, such as word embeddings and linguistic features, on a Spanish corpus about infectious diseases such as the Dengue, Zika, and Chikungunya viruses in Latin America. The corpus includes 10,843 positive tweets, 10,843 negative tweets, and 7,659 neutral tweets. Adel and Wang (2019) provided the first Arabic corpus for crisis-response messages, noting that no Arabic corpus associated with humanitarian crises existed although most humanitarian crises are in Arab countries. The corpus was created from Twitter data and includes tweets about the cholera epidemic and starvation in Yemen as well as the Syrian refugee crisis. The process of creating the corpus followed six steps: (1) data collection; (2) annotation of Arabic tweets; (3) text preprocessing; (4) feature extraction; (5) model building; and (6) model evaluation. In the supervised machine learning model used to predict the labels of Arabic tweets, three experiments were conducted using SVM, NB, and Random Forest (RF), with RF giving the best classification results. Aloqaily et al. (2020) applied the two main approaches to sentiment analysis, lexicon-based and machine learning, to Arabic tweets about the civil war and crises in Syria. The results demonstrated that machine learning performs better than the lexicon-based approach. For machine learning, the following algorithms were used: Logistic Model Trees (LMT), simple logistic, SVM, DT, voting-based, and k-Nearest Neighbor (k-NN). The best results were produced by LMT, with accuracy = 85.55, F1 = 0.92, and Area Under Curve (AUC) = 0.86.

Sentiment Analysis during an epidemic
In a sentiment analysis of 615 tweets from Nepal, using the hashtags COVID-19 and coronavirus, the final result showed that about 58% of users' tweets were optimistic, 15% were negative, and 27% were neutral (Pokharel, 2020). On the other hand, Barkur and Vibha (2020) presented a sentiment analysis of Indians' tweets about the lockdown by collecting data from Twitter using two hashtags: #IndiaLockdown and #IndiafightsCorona. This resulted in 24,000 tweets dominated by positive sentiments. In another study, researchers collected 530,000 tweets by using the keywords coronavirus and COVID-19. More than 36% of users posted positive tweets, while the negative tweets numbered only 14%, and the neutral tweets accounted for 50% (Manguri et al., 2020). Further, Bhat et al. (2020) used two hashtags to perform sentiment analysis related to the current coronavirus pandemic: #COVID-19 resulted in 92,646 collected tweets, and #Coronavirus resulted in 85,513 collected tweets. Regarding the #COVID-19 hashtag, most tweets had positive sentiments (51.97%). Neutral sentiments represented 34.05% of tweets, and 13.96% relayed negative sentiments. On the other hand, sentiments related to #Coronavirus were more neutral (41.27%). Positive tweets accounted for 40.91% of the overall numbers and 17.80% for negative sentiments. Overall, the results indicate that users' opinions were mostly positive or neutral.

Arabic Sentiment Analysis during epidemics
Baker et al. (2020) proposed a sentiment analysis system based on 54,065 Arabic tweets related to influenza, labeling the data as valid or invalid based on their relation to influenza. They used different machine learning techniques, such as support vector machine (SVM), naïve Bayes (NB), k-nearest neighbor (k-NN), and decision tree (DT). In a further study (2020) of Saudi attitudes toward COVID-19 preventive measures, which applied Arabic sentiment analysis to Twitter posts, the NB algorithm was applied to 53,127 tweets collected based on seven health measures:
- Great mosque closures
- Qatif lockdown
- School and university closures
- Public park, restaurant, and mall closures
- Suspension of all sports activities
- Suspension of Friday prayers
- 21-day curfew
The final conclusion of that study is that Saudi Twitter users are willing to support, and have positive opinions of, the health measures announced by the Saudi government. Religious practices also play an important role by increasing the overall positive sentiment.

Data
Our primary dataset is a repository of Tweet IDs, each corresponding to Arabic content posted on Twitter and related to the coronavirus pandemic (Addawood, 2020). The repository contains 3.8 million Tweet IDs for the period of January 1, 2020, to April 10, 2020. Tweets were compiled using Crimson Hexagon, a social media analytics platform that provides paid data stream access. The data was collected by identifying a list of trending hashtags and keywords mostly used by the public. The hashtags were categorized based on their meaning, how users were interacting with them, and who started them. The categorization was done by Addawood (2020). Table 1 shows a sample of the hashtags.
To collect a comprehensive dataset, 70 keywords and hashtags were categorized based on how they were oriented and used in conversations. Table 2 shows the hashtag topics with the number of tweets for each topic. The data was divided into four time periods, as shown in Table 3, to understand public opinion as the pandemic evolved.

Sentiment Lexicon
Before identifying sentiments, the dataset needed to be cleaned. The cleaning process included removing usernames, punctuation, numeric characters, links, English characters, and stop words, and normalizing the Arabic text. The average tweet length was 184.63 characters before cleaning and 142.23 characters after cleaning. Sentiment analysis techniques allow researchers and businesspeople to determine the different viewpoints expressed in social media text. Because comprehensive Arabic sentiment lexicons are scarce, we built our own lexicon using two methods. The first method drew on previously constructed Arabic sentiment lexicons (Al-Thubaity et al., 2018; Al-Twairesh et al., 2016; Salameh et al., 2015). These lexicons either did not contain words specifically suited to the Saudi dialect or were domain-specific. Our goal was to construct a Saudi dialect sentiment lexicon that can generalize to other domains. The second method drew on different Arabic textual data sources such as Twitter data (Nabil et al., 2015), book reviews (Aly and Atiya, 2013), and hotel reviews (Elnagar et al., 2018). For each of these datasets, the first two authors conducted the annotation process, in which they manually labeled each textual data point and extracted words expressing emotions. The datasets were divided between them, and each dataset was annotated by one annotator. The authors are currently working on developing a comprehensive Saudi dialect lexicon with nine annotators. The resulting lexicon contained 7,534 words, of which 1,808 (24%) were labeled as positive and 5,726 (76%) were labeled as negative.
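The cleaning steps listed above can be sketched in Python as follows. This is a minimal illustration and not the pipeline used in the study: the stop-word list is a tiny hypothetical placeholder, and the normalization rules (stripping diacritics and tatweel, unifying alef forms, mapping alef maksura to yeh and ta marbuta to heh) are common conventions assumed here, since the exact rules are not specified in the text.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
LATIN_RE = re.compile(r"[A-Za-z]+")
DIGIT_RE = re.compile(r"[0-9\u0660-\u0669]+")          # Western and Arabic-Indic digits
DIACRITICS_RE = re.compile(r"[\u064B-\u0652\u0640]")   # tashkeel marks and tatweel
NON_ARABIC_RE = re.compile(r"[^\u0621-\u064A\s]")      # punctuation and other symbols


def normalize_arabic(text):
    """Apply assumed normalization rules (not taken from the paper)."""
    text = DIACRITICS_RE.sub("", text)
    text = re.sub("[إأآ]", "ا", text)   # unify alef variants
    text = text.replace("ى", "ي")       # alef maksura -> yeh
    text = text.replace("ة", "ه")       # ta marbuta -> heh
    return text


# Hypothetical stop-word list, normalized so it matches normalized tokens.
STOP_WORDS = {normalize_arabic(w) for w in ("في", "من", "على", "عن", "إلى")}


def clean_tweet(text):
    """Remove links, usernames, English letters, digits, punctuation, stop words."""
    text = URL_RE.sub(" ", text)        # links first, since they contain Latin letters
    text = MENTION_RE.sub(" ", text)
    text = LATIN_RE.sub(" ", text)
    text = DIGIT_RE.sub(" ", text)
    text = normalize_arabic(text)
    text = NON_ARABIC_RE.sub(" ", text)
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)
```

Links are removed before English characters so that URL fragments do not survive as stray symbols, and diacritics are deleted before the generic symbol filter so that a word is not split where a mark used to be.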

Identification of Author Location
The global reach of the pandemic allowed people from all over the world to participate in this discussion. To find out where Twitter users were located, we used the location information available in the dataset, as provided by Crimson Hexagon. To infer locations, Crimson Hexagon uses two types of information: (1) geotagged locations, which are only available for about 1% of Twitter data (Jurgens et al., 2015; Morstatter et al., 2013); and (2) for tweets that are not geotagged, an estimation of users' countries, regions, and cities based on "various pieces of contextual information, for example, their profile information", as well as users' time zones and languages.

Sentiment Identification
Our goal is to have a full understanding of public reactions to the pandemic in Saudi Arabia. To identify the sentiment of each tweet in our dataset, we used a machine learning approach. Because the dataset is large, we selected a sample of it to annotate using the created lexicon; this sample was then used to train the machine learning models. Rather than sampling tweets at random, we wanted tweets typical of a normal Twitter user, so we selected tweets of roughly average length. As noted earlier, the average tweet length was 184.63 characters before cleaning and 142.23 characters after cleaning. We therefore selected tweets with a text length between 160 and 170 characters, which lies between these two averages. The total number of tweets in the sample is 129,391. By comparing the lexicon words to this sample, we were able to form a sentiment value for each tweet. The resulting labels are shown in Table 4 below.
Each tweet was annotated by checking each word in the tweet and comparing it with the lexicon dataset we have. The lexicon dataset labeled each word either positive or negative. We changed that to 1 and -1. For each word in the tweet, we added 1 to the tweet score if it was positive, subtracted 1 if it was negative, and skipped it if it was not in the dataset. Finally, if the tweet score was greater than zero we gave it a positive label, if it was less than zero we gave it a negative label, and if it was equal to zero it was labeled as neutral.
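The scoring rule described above can be written as a short Python sketch. The lexicon below is a hypothetical four-word placeholder used only for illustration; the actual lexicon contains 7,534 entries. Whitespace tokenization is also an assumption.

```python
# Hypothetical toy lexicon: +1 for positive words, -1 for negative words.
TOY_LEXICON = {"شكرا": 1, "جميل": 1, "خوف": -1, "مرض": -1}


def score_tweet(tokens, lexicon):
    """Add 1 per positive word, subtract 1 per negative word, skip unknown words."""
    return sum(lexicon.get(tok, 0) for tok in tokens)


def label_tweet(text, lexicon):
    """Map the tweet score to a positive, negative, or neutral label."""
    score = score_tweet(text.split(), lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Under this rule, a tweet with no lexicon words and a tweet whose positive and negative words cancel out both receive the neutral label.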
Binary classification models were built to identify the sentiment label in the rest of the dataset. To build the models, we used two off-the-shelf machine learning algorithms, Naive Bayes and Support Vector Machine, and trained classifiers using stratified 10-fold cross-validation with a ratio of 20:80 for testing and training. The models were built using only unigrams as features. Table 5 below shows the classification results for each model.
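To make the evaluation setup concrete, the following self-contained Python sketch shows stratified k-fold cross-validation around a unigram Naive Bayes classifier. It is an illustration under stated assumptions, not the study's implementation: the text does not name the toolkit used, the SVM model is not reproduced here, and the function and class names (`stratified_folds`, `UnigramNaiveBayes`, `cross_validate`) are hypothetical. Note also that with k = 10 each fold holds out roughly 10% of the data; how this relates to the reported 20:80 ratio is not specified in the text.

```python
import math
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k, seed=0):
    """Split indices into k folds while keeping class proportions similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)   # deal each class round-robin across folds
    return folds


class UnigramNaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in text.split():
                if tok in self.vocab:    # words unseen in training are ignored
                    score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cross_validate(texts, labels, k=10, seed=0):
    """Return the mean accuracy over the non-empty held-out folds."""
    accuracies = []
    for held_out in stratified_folds(labels, k, seed):
        if not held_out:
            continue
        held = set(held_out)
        train = [i for i in range(len(texts)) if i not in held]
        model = UnigramNaiveBayes().fit(
            [texts[i] for i in train], [labels[i] for i in train]
        )
        correct = sum(model.predict(texts[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)
```

Stratification matters here because lexicon-derived labels can be imbalanced; dealing each class across the folds separately keeps every held-out fold close to the overall label distribution.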

What is the overall sentiment towards the pandemic?
To have a better understanding of public opinion towards the pandemic, we studied the sentiment people expressed on social media from the beginning of the pandemic in January 2020 until Saudi Arabia announced lockdown measures in April 2020. Figure 1 shows the distribution of sentiment across time periods. The results of the Support Vector Machine model and the Naive Bayes model are shown in Table 5. We chose the model with better results and applied it to the whole dataset. The results in Figure 1 and Figure 2 are from the Support Vector Machine model. Figure 1 shows that most tweets expressed a neutral sentiment, which may be because most of the tweets contained prayers that express neither negative nor positive emotions.
Even though the sentiment distribution is very similar across time periods, we can see that the first two periods had more positive sentiment than later periods. This might be because people had high hopes that the virus was temporary and would not stay for long, which was eventually proven wrong. People expressed fewer positive emotions in the last period. This may be because the country started implementing precautionary measures such as the lockdown that started on March 23rd, 2020. Moreover, there was not much change in the expression of negative feelings even though the situation was escalating. That might be because the country implemented a good strategy for assuaging people's concerns by spreading awareness materials and information through social media and other outlets. Other factors may include the high trust the public has in their government and in the validity of these precautionary measures.

How have people reacted to different topics?
To measure public reactions to this pandemic, we used the hashtag categorization in the dataset. The topics included showing social solidarity, discussing precautionary measures, supporting decisions taken by the government, and using hashtags started by the government to raise awareness and spread more information about the virus. The topics were labeled manually, based on the meaning and purpose of each hashtag. Table 1 shows a sample of the hashtags and their topics. Figure 2 shows the distribution of sentiment across different topics. The results show that people expressed more positive sentiments when they were supporting each other through the pandemic. These include hashtags such as (translated from Arabic) "Let's Separate for our health", "Distance does not separate us", "Oh Lord, raise us from calamity", and "Our merchants are good".

What is the geographical distribution of sentiment?
At 830,000 square miles, Saudi Arabia is a geographically large country, and its cities are separated by large distances. Out of the total dataset, only 42,238 tweets had a location; Table 6 shows the regions and their sentiments. Positive sentiment has the higher percentage in most regions except Jazan and Najran, where negative sentiment accounts for 51% and 59%, respectively. However, these two regions did not affect the overall sentiment, as they produced few tweets compared to the rest of the regions: tweets from Najran represent only 0.4% of the overall dataset, while those from Jazan represent 0.5%, as shown in Table 7. After analyzing the regions' sentiment, we analyzed Saudi cities to get a better understanding of sentiment. Figure 3 shows the number of tweets and the distribution of sentiment. From Table 7, we can see that a large number of tweets came from the capital city of Riyadh, with 42% of tweets expressing negative sentiment and 58% expressing positive sentiment. The second city is Jeddah, with 43% of tweets expressing negative sentiment and 57% expressing positive sentiment. Third comes Dammam, with 41% of tweets expressing negative sentiment and 59% expressing positive sentiment. The fourth is Makkah, with 40% of tweets expressing negative sentiment and 60% expressing positive sentiment. Table 7 shows additional cities, their sentiment, and their percentage of the total.
The city with the highest share of negative emotions is Al-Qatif governorate, which is located in the Eastern Province and where the first case of COVID-19 in the country was discovered. Its tweets represent only 2.7% of the total tweets of the Eastern Province and 0.2% of the entire dataset, so this negativity does not have a strong effect on the final sentiment.
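The per-region percentages discussed here amount to a simple aggregation over (location, label) pairs. The Python sketch below shows one way to compute them; the function names and the toy records with placeholder region names are hypothetical and do not reproduce the study's data.

```python
from collections import Counter, defaultdict


def sentiment_shares(records):
    """records: list of (region, label) pairs -> {region: {label: percent}}."""
    counts = defaultdict(Counter)
    for region, label in records:
        counts[region][label] += 1
    shares = {}
    for region, label_counts in counts.items():
        total = sum(label_counts.values())
        shares[region] = {
            label: round(100 * n / total, 1) for label, n in label_counts.items()
        }
    return shares


def region_shares(records):
    """Percentage of all located tweets that come from each region."""
    totals = Counter(region for region, _ in records)
    return {region: round(100 * n / len(records), 1) for region, n in totals.items()}


# Toy records for illustration only (placeholder regions, made-up labels).
toy = (
    [("Region A", "positive")] * 3
    + [("Region A", "negative")]
    + [("Region B", "negative")]
)
```

Reporting both quantities together, as the text does, shows why a region with mostly negative tweets can still have little effect on the overall sentiment when its share of the dataset is small.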

Ethical Considerations
This research was conducted with respect for the collected tweets and their authors' opinions. We also ensured that all tweets were collected without any modifications to their context, location, or any other related attributes. Moreover, the annotation process was done using several lexicons to identify the tweets' polarity in order to avoid researcher intervention.
The shared results are objective, transparent, and unbiased, so they meet the researchers' requirements. Furthermore, this research does not discuss any political arguments. Rather, it mainly focuses on citizens' opinions of the precautionary measures applied in Saudi Arabia by exploring the most popular hashtags that circulated on Twitter during the studied time period.
This research could be used in applications that might affect people negatively. For example, it could be used to manipulate public opinion and drive people toward specific political agendas through a better understanding of their reactions to political topics. Moreover, when the models used in this study are applied to a new dataset that differs from our original data, with new domains or a less commonly represented population, their performance may degrade. Thus, the results of the evaluation metrics can differ due to the biases introduced by the new dataset, since our models' results, as shown in Table 5, are based on our dataset (Li et al., 2020).

Conclusion
In this research, our goal was to get a better understanding of Saudis' reactions to the coronavirus. To do so, we explored the sentiments shared by the public, with a focus on the analysis of the Saudi dialect. Such understanding can contribute to fast-response policymaking in the early stages of a crisis. To achieve this goal, we built our own lexicon with 7,534 sentiment words, of which 1,808 were labeled as positive and 5,726 were labeled as negative. After that, we ran the lexicon on a subset of the data. Then, two machine learning algorithms were used, NB and SVM, with 10-fold cross-validation and a ratio of 20:80 for testing and training. The overall accuracy of the classifiers was good: the SVM classifier achieved an accuracy of 98%, compared with 87% for NB. Our results showed that people had a more optimistic sentiment and strongly supported both the government's decisions and the precautionary measures to overcome the virus. However, one of the major limitations is the lack of resources for analyzing Arabic text, especially given that there are different Arabic dialects for each region. As for future work, we plan to study the correlation between the different time periods and sentiment. We also plan to get access to an extended version of the dataset to continue the analysis. Moreover, the study of expressed emotions such as fear and happiness is another direction for such research.