Zoom Out and Observe: News Environment Perception for Fake News Detection

Fake news detection is crucial for preventing the dissemination of misinformation on social media. To differentiate fake news from real news, existing methods observe the language patterns of the news post and “zoom in” to verify its content with knowledge sources or check its readers’ replies. However, these methods neglect the information in the external news environment where a fake news post is created and disseminated. The news environment represents recent mainstream media opinion and public attention, which is an important source of inspiration for fake news fabrication, because fake news is often designed to ride the wave of popular events and catch public attention with unexpected novel content for greater exposure and spread. To capture the environmental signals of news posts, we “zoom out” to observe the news environment and propose the News Environment Perception Framework (NEP). For each post, we construct its macro and micro news environments from recent mainstream news. Then we design a popularity-oriented and a novelty-oriented module to perceive useful signals and further assist the final prediction. Experiments on our newly built datasets show that the NEP can efficiently improve the performance of basic fake news detectors.


Introduction
The wide spread of fake news on online social media has influenced public trust (Knight Foundation, 2018) and poses real-world threats to politics (Fisher et al., 2016), finance (ElBoghdady, 2013), public health (Naeem and Bhatti, 2020), etc. Under such severe circumstances, automatically detecting fake news has become an important countermeasure in practice. Besides directly observing the post's content patterns (Volkova et al., 2017; Wang et al., 2018) (Figure 1(a)), most existing methods for fake news detection "zoom in" to find richer post-level signals by checking user replies to the post (Shu et al., 2019a; Zhang et al., 2021) and verifying the claim with knowledge sources (Popat et al., 2018) (Figure 1(b)). However, these methods neglect a different line of "zooming out" to observe the external news environment where a fake news post is created and disseminated. Our starting point is that a news environment, which represents recent mainstream media opinion and public attention, is an important source of inspiration for the fabrication of contemporary fake news. Since fake news pays off only if it is widely exposed and spreads virally, a fake news creator would carefully design how to improve the post's visibility and attract audiences' attention in the context (environment) of recently published news. Such intentional design connects fake news with its news environment; conversely, we might find useful signals from the news environment to better characterize and detect fake news.
Figure 2: A fake news post p and its news environment containing recent news items in three days (2019/11/12 to 2019/11/14). Only the items in events that are reported multiple times (differentiated by dot colors) are displayed for brevity. We can see that p falls in a popular event on a Syria-China World Cup qualifier compared with other events and focuses on a novel aspect (unusual celebration in Syria).
Figure 2 shows an example, where we name the whole set of recent news items the macro news environment and the event-similar subset the micro news environment. For the fake news post p on Syria's ceasefire thanks to a win over China in a football match, we observe two important signals from its news environments: 1) Popularity. In the macro news environment that contains all recent news items, p is related to a relatively popular event (the Syria-China football match) among the five events in different domains. This would bring p greater exposure and, in turn, greater impact.
2) Novelty. In the micro news environment, the items mostly focus on the game itself (e.g., "Wu Lei had a shot"), while p provides novel side information about Syria's unusual celebration. This would help catch audiences' attention and boost the spread of p (Vosoughi et al., 2018).
Unfortunately, these potentially useful signals can hardly be considered by post-only and "zoom-in" methods, as they dig into the inherent properties of a single post (e.g., styles, emotions and factual correctness) rather than observing the surrounding environments of the post.
To enable fake news detection systems to exploit information from news environments, we propose the News Environment Perception Framework (NEP). As presented in Figure 3, for the post p, we construct two news environments, MACROENV and MICROENV, using recent mainstream news data to facilitate the perception from different views. We then design a popularity-oriented and a novelty-oriented perception module to depict the relationship between p and these recent news items.
The environment-perceived vectors are fused into an existing fake news detector for prediction.
Our contributions are as follows:
• Problem: To the best of our knowledge, we are the first to incorporate news environment perception in fake news detection.
• Method: We propose the NEP framework, which exploits the perceived signals from the macro and micro news environments of the given post for fake news detection.
• Data & Experiments: We construct the first dataset which includes contemporary mainstream news data for fake news detection. Experiments on offline and online data show the effectiveness of NEP.

Related Work
Fake news detection is mostly formulated as a binary classification task where models are expected to accurately judge the given post as real or fake. Existing works focus on discovering distinctive features in the post from various aspects, as Figure 1 shows, which we roughly group as follows.
Post-only methods aim at finding shared patterns in appearances across fake news posts (Figure 1(a)). Text-based studies focus on better constructing features based on sentiment (Ajao et al., 2019), writing style (Przybyla, 2020), language use (Volkova et al., 2017), discourse (Karimi and Tang, 2019), etc. Other works rely on deep neural models to encode contents and handle certain scenarios, such as visual-based (Qi et al., 2019; Cao et al., 2020), multi-modal (Wang et al., 2018) and multi-domain (Nan et al., 2021) detection. Our NEP provides additional news environmental information and can coordinate with post-only methods (as we show in Section 4).
"Zoom-in" methods introduce related sources to understand the post in finer detail. One line is to use social contexts (bottom of Figure 1(b)). Some directly analyze the network information to find patterns shaped by user relationships and information diffusion (Shu et al., 2019b; Zhou and Zafarani, 2019; Nguyen et al., 2020; Silva et al., 2021), and others leverage the collective wisdom reflected by user responses (Kochkina et al., 2018; Shu et al., 2019a; Zhang et al., 2021). For example, a refuting reply saying "FYI, this is false" would be an important reference for making a prediction. Another line refers to knowledge sources (top of Figure 1(b)) and aims at verifying the post with retrieved evidence for detection. The knowledge sources can be webpages (Popat et al., 2018; Ma et al., 2019; Vo and Lee, 2021; Wu et al., 2021; Sheng et al., 2021b), knowledge graphs (Cui et al., 2020), online encyclopedias (Thorne et al., 2018; Aly et al., 2021), fact-checking article bases (Augenstein et al., 2019; Shaar et al., 2020), etc. Our NEP starts from a different view, for it "zooms out" to observe the news environment where the post spreads. Note that our method is not equivalent to a knowledge-based method that uses news environments as evidence bases, as it does not pick evidential news items to prove or disprove the given post, but aims at reading the news "atmosphere" when the post is published. In that sense, "zoom-in" and "zoom-out" methods can be integrated to detect fake news comprehensively (as we also show in Section 4).

Proposed Method
Figure 3 overviews our proposed framework NEP, whose goal is to empower fake news detectors with the effective perception of news environments. Given a post $p$, we first construct its macro and micro environments (MACROENV and MICROENV) using recent news data. Then we model the post-environment relationships to generate the environment-perceived vectors $v^{p,mac}$ and $v^{p,mic}$. Finally, the two vectors are fused with the post representation $o$ derived from the fake news detector to predict whether $p$ is real or fake.

News Environment Construction
The environment is the objects, circumstances, or conditions by which one is surrounded (Merriam-Webster, 2021). Accordingly, a news environment should contain news reports which can reflect the present distribution of mainstream focuses and audiences' attention. To this end, we collect news items published by mainstream media outlets as basic environmental elements, in that their news reports generally face a large, common audience.
Let $E$ be the set of all collected news items published earlier than $p$. We construct a macro environment (MACROENV) and a micro environment (MICROENV), which are defined as follows:
• MACROENV is the set of news items in $E$ released within $T$ days before $p$ is published:
$$E^{mac} = \{e : e \in E,\ 0 < t_p - t_e \le T\}, \tag{1}$$
where $t_p$ and $t_e$ respectively denote the publication dates of $p$ and the news item $e$.
• MICROENV is the set of news items in $E^{mac}$ that are relevant to $p$. Here, we query $E^{mac}$ using $p$ and obtain the top $k$ as the set:
$$E^{mic} = \{e : e \in \mathrm{Topk}(p, E^{mac})\}, \tag{2}$$
where $k = r|E^{mac}|$ and $r \in (0, 1)$ determines the proportion.
Intuitively, the time-constrained environment MACROENV provides a macro perspective of what the mass audience has recently read and focused on, while the further relevance-constrained MICROENV describes the distribution of items about similar events. We use a pretrained language model $M$ (e.g., BERT (Devlin et al., 2019)) to obtain the post/news representation. For $p$ or each item $e$ in the macro/micro environment, the initial representation is the output of $M$ for the [CLS] token:
$$\mathbf{p} = M(p), \quad \mathbf{e} = M(e). \tag{3}$$
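The two definitions can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `build_environments`, the use of cosine similarity for the "query" step, and rounding k up with `ceil` are all assumptions made for the example.

```python
import math
from datetime import date, timedelta


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_environments(post_vec, post_date, news, T=3, r=0.1):
    """news: list of (vector, publication_date) pairs. Returns (macro, micro)."""
    # MACROENV: items published within T days before the post.
    macro = [(vec, d) for vec, d in news
             if timedelta(0) < post_date - d <= timedelta(days=T)]
    # MICROENV: the top-k macro items most similar to the post.
    # Assumption: k is rounded up and at least one item is kept.
    k = max(1, math.ceil(r * len(macro))) if macro else 0
    ranked = sorted(macro, key=lambda item: cosine(post_vec, item[0]),
                    reverse=True)
    return macro, ranked[:k]


# Toy data with 2-d "embeddings" (hypothetical values).
post_vec, post_date = [1.0, 0.0], date(2019, 11, 15)
news = [
    ([0.9, 0.1], date(2019, 11, 14)),
    ([0.0, 1.0], date(2019, 11, 13)),
    ([1.0, 0.0], date(2019, 11, 1)),   # older than T days: excluded
    ([0.5, 0.5], date(2019, 11, 15)),  # not earlier than the post: excluded
]
macro, micro = build_environments(post_vec, post_date, news, T=3, r=0.5)
```

In this toy run the macro environment keeps the two items from the preceding three days, and the micro environment keeps the single item closest to the post.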

News Environment Perception
The perception of news environments of p is to capture useful signals from existing mainstream news items. The signals are expected to discover unique post-environment interactive patterns of fake news. Starting from the motivation of fake news creators to widely diffuse fabricated information to the whole online news ecosystem, we guide the model to perceive from two important diffusion-related perspectives, i.e., popularity and novelty, in the MACROENV and the MICROENV.

Popularity-Oriented MACROENV Perception.
A fabricated post would be more likely to go viral and thus gain more influence when it is related to trending news. Thus, a fake news creator might consider how to ride the popularity of hot events when writing a fake news post. Here we consider how popular the main event of $p$ is in the MACROENV. We transform the perception of popularity into similarity estimation between $p$ and individual news items. That is, if many items in the MACROENV are similar to $p$, then $p$ might also be popular in such an environment. Following Reimers and Gurevych (2019), we first calculate the cosine similarity between $p$ and each news item (say, $e_i$) in $E^{mac}$:
$$\cos(\mathbf{p}, \mathbf{e}_i) = \frac{\mathbf{p} \cdot \mathbf{e}_i}{\|\mathbf{p}\| \, \|\mathbf{e}_i\|}. \tag{4}$$
The similarity list $\{\cos(\mathbf{p}, \mathbf{e}_i)\}$ of variable length $|E^{mac}|$ does not work well with networks that mostly take fixed-dimensional vectors as inputs. Thus, the list requires a further transformation, where we expect the transformed environment-perceived vector to reflect how similar $p$ is to the environment without much information loss. Following Xiong et al. (2017) and Liu et al. (2020), we choose to calculate a soft counting on the list to obtain a distribution that mimics a hard bin plot. Specifically, we employ the Gaussian Kernel Pooling proposed by Xiong et al. (2017) across the range of cosine similarity to get soft counting values. Assuming that we use $C$ kernels $\{K_k\}_{k=1}^{C}$, the output of the $k$-th kernel is:
$$K_k(\mathbf{p}, \mathbf{e}_i) = \exp\left(-\frac{(\cos(\mathbf{p}, \mathbf{e}_i) - \mu_k)^2}{2\sigma_k^2}\right), \tag{5}$$
$$K_k(\mathbf{p}, E^{mac}) = \sum_{i=1}^{|E^{mac}|} K_k(\mathbf{p}, \mathbf{e}_i), \tag{6}$$
where $\mu_k$ and $\sigma_k$ are the mean and width of the $k$-th kernel. In Eq. (5), if the similarity between $p$ and $e_i$ is close to $\mu_k$, the exponential term will be close to 1; otherwise it will be close to 0. We then sum the exponential terms with Eq. (6). This explains why a kernel is like a soft counting bin of similarities. We scatter the means $\{\mu_k\}_{k=1}^{C}$ of the $C$ kernels in $[-1, 1]$ to completely and evenly cover the range of cosine similarity. The widths are controlled by $\{\sigma_k\}_{k=1}^{C}$. Appendix B.1 provides the details.
A $C$-dim similarity feature in the MACROENV is obtained by concatenating all kernels' outputs and normalizing with the summation of the outputs:
$$K(\mathbf{p}, E^{mac}) = \mathrm{Norm}\left(\bigoplus_{k=1}^{C} K_k(\mathbf{p}, E^{mac})\right), \tag{7}$$
where $\oplus$ is the concatenation operator and $\mathrm{Norm}(\cdot)$ denotes the normalization.
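The soft counting described above can be illustrated with a small pure-Python sketch. It is not the authors' code: the function name `kernel_pooling`, the single shared width `sigma`, and its value are assumptions (the paper defers the actual kernel settings to Appendix B.1), while `num_kernels=22` follows the value of C reported in the experimental setup.

```python
import math


def kernel_pooling(sims, num_kernels=22, sigma=0.05):
    """Soft-count cosine similarities into num_kernels Gaussian bins."""
    # Kernel means spread evenly over [-1, 1].
    # Assumption: every kernel shares the same width sigma.
    step = 2.0 / (num_kernels - 1)
    mus = [-1.0 + k * step for k in range(num_kernels)]
    # Each kernel sums exp(-(s - mu)^2 / (2 sigma^2)) over all similarities,
    # so it behaves like a soft histogram bin centred at mu.
    counts = [sum(math.exp(-((s - mu) ** 2) / (2 * sigma ** 2)) for s in sims)
              for mu in mus]
    # Normalize by the sum of all kernel outputs.
    total = sum(counts)
    return [c / total for c in counts] if total > 0 else counts


# Two items very similar to the post and one unrelated item (toy values).
feature = kernel_pooling([0.9, 0.88, 0.1])
peak_bin = max(range(len(feature)), key=feature.__getitem__)
```

Because the output length depends only on the number of kernels, environments of different sizes map to vectors of the same dimension, which is the property the fixed-input MLP needs.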
By calculating $K(\mathbf{p}, E^{mac})$, we obtain a soft distribution of similarities between $p$ and the MACROENV as the perception of popularity. To enrich the perceived information, we generate the MACROENV-perceived vector for $p$ by fusing the similarity and semantic information. Specifically, we aggregate the post vector, the center vector of the MACROENV $m(E^{mac})$ (obtained by averaging all vectors), and the similarity feature using an MLP:
$$v^{p,mac} = \mathrm{MLP}(\mathbf{p} \oplus m(E^{mac}) \oplus K(\mathbf{p}, E^{mac})). \tag{8}$$

Novelty-Oriented MICROENV Perception.
Different from MACROENV, MICROENV contains mainstream news items close to $p$, which indicates that they are likely to share similar events. However, even in a popular event, a post may still attract little attention if it is too similar to others. Vosoughi et al. (2018) found that false news was more novel than true news on Twitter with reference to the tweets that the users were exposed to (which could be regarded as a user-level news environment). This might explain why fake news spreads "better". We thus consider how novel $p$ is in the event-similar MICROENV. If the content of a post is novel, it is expected to be an outlier in such an event. Here, we use the center vector $m(E^{mic})$ of MICROENV as a reference. Specifically, we again use Eqs. (5) to (7), but here calculate two similarity features, $K(\mathbf{p}, E^{mic})$ and $K(m(E^{mic}), E^{mic})$. The latter serves as a reference for the former and helps the model "calibrate" its perception. The generation of the MICROENV-perceived vector for $p$ is as follows:
$$u^{sem} = \mathrm{MLP}(g(\mathbf{p}, m(E^{mic}))), \tag{9}$$
$$u^{sim} = \mathrm{MLP}(g(K(\mathbf{p}, E^{mic}), K(m(E^{mic}), E^{mic}))), \tag{10}$$
$$v^{p,mic} = \mathrm{MLP}(u^{sem} \oplus u^{sim}), \tag{11}$$
where the comparison function $g(x, y) = (x \odot y) \oplus (x - y)$ and $\odot$ is the Hadamard product operator. $u^{sem}$ and $u^{sim}$ respectively aggregate the semantic and similarity information. The MLPs are individually parameterized. We omit their index numbers in the above equations for brevity.
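As a rough illustration of the novelty-oriented comparison, the sketch below builds the two comparison inputs that would be passed to the MLPs producing the semantic and similarity vectors. The MLPs themselves are left out, and the helper names (`soft_counts`, `compare`, `novelty_inputs`), the kernel means, and the shared width are hypothetical choices for the example.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def soft_counts(anchor, items, mus, sigma=0.1):
    """Normalized Gaussian soft counts of cos(anchor, item) over kernel means."""
    sims = [cosine(anchor, e) for e in items]
    counts = [sum(math.exp(-((s - mu) ** 2) / (2 * sigma ** 2)) for s in sims)
              for mu in mus]
    total = sum(counts)
    return [c / total for c in counts]


def compare(x, y):
    """g(x, y): element-wise product concatenated with element-wise difference."""
    return [a * b for a, b in zip(x, y)] + [a - b for a, b in zip(x, y)]


def novelty_inputs(post_vec, micro_vecs, mus):
    # Center of the micro environment: the average of its item vectors.
    dim = len(post_vec)
    center = [sum(v[i] for v in micro_vecs) / len(micro_vecs)
              for i in range(dim)]
    # Semantic comparison: the post against the center vector.
    sem = compare(post_vec, center)
    # Similarity comparison: the post's similarity distribution against the
    # center's own distribution, which acts as the reference.
    sim = compare(soft_counts(post_vec, micro_vecs, mus),
                  soft_counts(center, micro_vecs, mus))
    return sem, sim


mus = [-1.0, -0.5, 0.0, 0.5, 1.0]  # toy kernel means
sem, sim = novelty_inputs([1.0, 0.0], [[0.0, 1.0], [0.6, 0.8]], mus)
```

A post that sits far from the center yields large entries in the difference halves of both outputs, which is the outlier signal the module is meant to pick up.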

Prediction under Perceived Environments
As our environment perception does not depend on a particular detection model, we expect our NEP to be compatible with various fake news detectors. In our NEP, we achieve this by gate fusion. Take a post-only detector as an example. We apply the gate mechanism to adaptively fuse $v^{p,mac}$ and $v^{p,mic}$ according to $o$:
$$v^{p} = g \odot v^{p,mac} + (1 - g) \odot v^{p,mic}, \tag{12}$$
where the gating vector $g = \mathrm{sigmoid}(\mathrm{Linear}(o \oplus v^{p,mac}))$, the sigmoid constrains the value of each element to $[0, 1]$, and $o$ denotes the last-layer feature from a post-only detector. $o$ and $v^{p}$ are further fed into an MLP and a softmax layer for final prediction:
$$\hat{y} = \mathrm{softmax}(\mathrm{MLP}(o \oplus v^{p})). \tag{13}$$
When working with more complex detectors that rely on other sources besides the post, we can simply concatenate those feature vectors in Eq. (13). For example, we can concatenate $v^{p}$ with the post-article joint representation if the fake news detector is knowledge-based. During training, we minimize the cross-entropy loss.
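The gate fusion step can be sketched as follows. This is an illustrative stand-in, not the authors' code: the convex-combination form of the gate is the usual formulation and is assumed here, and the weights and bias of the linear layer are hypothetical placeholders that would be learned in practice.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(x, weight, bias):
    """Plain affine map: one output per (weight row, bias) pair."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def gate_fuse(o, v_mac, v_mic, weight, bias):
    """Fuse the macro and micro vectors with a gate computed from o and v_mac."""
    # g = sigmoid(Linear(o concatenated with v_mac)), one gate value per dimension.
    g = [sigmoid(z) for z in linear(o + v_mac, weight, bias)]
    # Assumed fusion form: g weights the macro vector, (1 - g) the micro vector.
    return [gi * a + (1.0 - gi) * b for gi, a, b in zip(g, v_mac, v_mic)]


o, v_mac, v_mic = [0.2, -0.1], [1.0, 0.0], [0.0, 1.0]
zero_w = [[0.0] * 4 for _ in range(2)]

# Zero weights and bias give g = 0.5 everywhere: a plain average.
balanced = gate_fuse(o, v_mac, v_mic, zero_w, [0.0, 0.0])
# A strongly positive or negative bias pushes each dimension to one side.
extreme = gate_fuse(o, v_mac, v_mic, zero_w, [100.0, -100.0])
```

The fused vector would then be concatenated with `o` and passed to the classifier; under this form, a gate value near 1 in a dimension means the model leans on the macro environment there.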

Experiment
We conduct experiments to answer the following evaluation questions:
• EQ1: Can NEP improve the performance of fake news detection?
• EQ2: How effectively does NEP model the macro and micro news environments?
• EQ3: In what scenarios do news environments help with fake news detection?

Datasets
We integrated existing datasets in Chinese and English and then collected news items released in the corresponding time periods. We do not use a single existing dataset for two reasons: 1) no existing dataset provides the contemporary news items of verified news posts to serve as the elements in news environments; 2) most datasets were collected in a short time period and some suffer from a high class imbalance across years. The statistics are shown in Table 1 and the details are as follows:

Chinese Dataset
Post: We merged the non-overlapping parts of multiple Weibo datasets from Ma et al. (2016) (excluding unverified posts), Song et al. (2019), Zhang et al. (2021) and Sheng et al. (2021a) to achieve better coverage of years and avoid spurious correlation with specific news environments (e.g., one full of COVID-19 news). To balance the number of real and fake posts across the years, we added news posts verified by a news verification system NewsVerify 5 and resampled the merged set. The final set contains 39,066 verified posts on Weibo ranging from 2010 to 2021.
News Environment: We collected the news items from the official accounts of six representative mainstream news outlets that have over 30M followers on Weibo (see sources in Appendix A). Further post-processing resulted in 583,208 news items from 2010 to 2021.

English Dataset
Post: Similarly, we merged the datasets from (Kochkina et al., 2018) (excluding unverified), (Augenstein et al., 2019) (excluding those without claim dates), and (Shaar et al., 2020). For posts or claims from fact-checking websites, we used the provided claim dates instead of the publication dates of the fact-checking articles, to avoid potential data contamination where the later news environment is more likely to contain corresponding fact-checking news and support direct fact verification. We obtained 6,483 posts from 2014 to 2018 after dropping the posts labeled as neutral and re-sampling.
News Environment: We use news headlines (plus short descriptions if any) from Huffington Post, NPR, and Daily Mail as a substitute for news tweets due to Twitter's restrictions (see sources in Appendix A). The bias ratings of the three outlets are respectively left, center, and right according to the AllSides Media Bias Chart, which enriches the diversity of news items. We preserved the news headlines from 2014 to 2018 and obtained a set of 1,003,646 news items.
5 https://newsverify.com/

Experimental Setup
Base Models. Technically, our NEP could coordinate with any fake news detector that produces a post representation. Here we select four post-only methods and two "zoom-in" (knowledge-based) methods as our base models.
Post-Only: 1) Bi-LSTM (Graves and Schmidhuber, 2005), which is widely used to encode posts in existing works (Shu et al., 2019a; Karimi and Tang, 2019); 2) $\mathrm{EANN}_T$ (Wang et al., 2018), which uses adversarial training to remove event-specific features obtained from TextCNN (Kim, 2014); 3) BERT (Devlin et al., 2019); 4) BERT-Emo (Zhang et al., 2021), which fuses a series of emotional features with BERT-encoded features for classification (publisher emotion version).
"Zoom-in": 1) DeClarE (Popat et al., 2018), which considers both the post and retrieved documents as possible evidence; 2) MAC (Vo and Lee, 2021), which builds a hierarchical multi-head attention network for evidence-aware detection.
Implementation Details. We obtained the sentence representations from SimCSE (Gao et al., 2021) models that are based on pretrained BERT models in the Transformers package (Wolf et al., 2020) 9 and were post-trained on the collected news items. We froze SimCSE when training NEP. For DeClarE and MAC, we prepared at most five articles in advance as evidence for each post by retrieving against fact-checking databases. 10 In environment modeling, $T = 3$, $r = 0.1$, and $C = 22$. We limit $|E^{mac}| \ge 10$. We implemented all methods using PyTorch (Paszke et al., 2019) with AdamW (Loshchilov and Hutter, 2019) as the optimizer. We reported test results w.r.t. the best validation epoch. Appendix B provides more implementation details.
Evaluation Metrics. As the test sets are roughly balanced, we report accuracy (Acc.), macro F1 score (macF1) and the F1 scores of the fake and real classes ($F1_{fake}$ and $F1_{real}$). We will use a new metric for skewed test data (see Section 5).
Table 2 shows the performance of base models with and without the NEP on the two datasets. We have the following observations:

Performance Comparison (EQ1)
First, with the help of our NEP, all six base models see a performance improvement in terms of accuracy and macro F1. This validates the effectiveness and compatibility of NEP.
Second, for post-only methods, F1 fake generally benefits more than F1 real when using NEP, which indicates that news environments might be more helpful in highlighting the characteristics of fake news. This is a practical property of the NEP as we often focus more on the fake news class.
Third, the "zoom-in" knowledge-based methods outperform their corresponding post-only base model (here, Bi-LSTM) with the help of relevant articles, but the improvement is small. This may be due to the difficulty of finding valuable evidence. Our NEP brings additional gains, indicating that the information perceived from news environments is different from verified knowledge, and that the two play complementary roles.
9 bert-base-chinese and bert-base-uncased
10 We attempted to collect webpages using our posts as queries as Popat et al. (2018) did, but few could serve as evidence except fact-checking articles. As an alternative, we directly used articles from Sheng et al. (2021a) for Chinese and collected about 8k articles from the well-known fact-checking website Snopes.com for English.

Evaluation on Variants of NEP (EQ2)
Ablation Study. We have two ablative groups as shown in Table 3: w/o Fake News Detector: We directly use one of the two environment-perceived vectors or both to see whether they can work when not cooperating with the fake news detector's output o. The macro F1 scores on both datasets indicate their moderate effectiveness as sole inputs, and that coordinating with a post-only detector is a more practical setting.
w/o Environment Perception Modules: By respectively removing MACROENV and MICROENV from the best-performing models BERT-Emo+NEP and DeClarE+NEP, we see a performance drop in macro F1 when removing either of them, indicating that the two environments are both necessary and play complementary roles in detection.
Effects of the proportion factor r for the MICROENV. We adjusted r from 0.05 to 0.30 with a step of 0.05 on BERT-Emo+NEP to see the impact of the scale of the MICROENV (T = 3). As Figure 4(a) shows, increasing r enlarges the MICROENV but leads only to fluctuations in accuracy. We do not see significant improvement after r = 0.1. We speculate that an r that is too small may hardly cover enough event-similar items, while a large r may include much irrelevant information, bringing little gain (e.g., r = 0.3 on the Chinese dataset) or even lowering the performance (e.g., r = 0.15 for both datasets).
Effects of the day difference T for the MACROENV. We set T = 1, 3, 5, 7, 9 on BERT-Emo+NEP to see how many days of news items should be considered (T = 0 corresponds exactly to the base model). Figure 4(b) shows a tendency similar to (a). We find the highest accuracy when T = 3 on both datasets. This is reasonable, as popularity should be considered in a moderately short time interval that allows the events to develop but not to be forgotten.

Environment Analysis (EQ3)
Categorization of macro- and micro-preferred samples. We selected the top 1% of Chinese fake news samples for which NEP relies more on MACROENV or on MICROENV according to the gate vectors. Then we manually categorized these samples to probe what information the macro/micro environment might provide. From Figure 5, we see that MACROENV is more useful for samples about natural disasters and accidents (e.g., earthquakes and air crashes), while MICROENV works effectively in Society & Life (e.g., robbery and education). This is in line with our intuition: MACROENV-preferred fake news posts are often related to sensational events, so the popularity in MACROENV would help more; and MICROENV-preferred ones are often related to common events in daily news, and thus their novelty in MICROENV would be highlighted. This analysis deepens our understanding of the applicability of different news environments.
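One way to carry out such a selection is sketched below. The paper only states that the samples were chosen "according to the gate vectors", so the scoring rule here (the mean gate value, read as the weight placed on MACROENV) and the function name `split_by_gate` are assumptions for illustration.

```python
def split_by_gate(gates, top_fraction=0.01):
    """gates: dict mapping sample id -> gate vector g (values in [0, 1]).

    Returns (macro_preferred_ids, micro_preferred_ids).
    """
    # Assumption: a higher mean gate value means more weight on MACROENV.
    scores = {sid: sum(g) / len(g) for sid, g in gates.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = max(1, round(top_fraction * len(ranked)))
    macro_preferred = ranked[:n]
    micro_preferred = ranked[-n:][::-1]  # lowest mean gate first
    return macro_preferred, micro_preferred


# Toy gate vectors for three samples (hypothetical values).
toy_gates = {"a": [0.9, 0.8], "b": [0.5, 0.5], "c": [0.1, 0.2]}
macro_ids, micro_ids = split_by_gate(toy_gates, top_fraction=0.34)
```

The two returned lists would then be handed to annotators for manual categorization.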
Case study. Figure 6 shows three fake news cases in different scenarios. Case (a) relies more on MICROENV than MACROENV. We can see moderate popularity of its event about Huawei, but the message about HarmonyOS is novel among the items on 5G and cooperation. In contrast, the admit card in case (b) is moderately novel but Gaokao is the most popular event, so the NEP puts a higher weight on MACROENV. Case (c) is a popular and novel fake news post about Japan's great healthcare for citizens coming back from Wuhan, which was posted during the first round of the COVID-19 pandemic in China. The exploitation of information from both sides results in a tie between the two environments. These cases intuitively show how NEP handles different scenarios. We include further analysis of cases where the news environment might be ineffective in Appendix D.

Discussion in Practical Systems
Evaluation on skewed online data. We tested BERT-Emo and BERT-Emo+NEP on a dump of seven-month data from a Chinese fake news detection system. Different from the offline datasets, this real-world set is highly skewed (30,977 real vs. 309 fake, roughly 100:1). Under such skewed circumstances, some metrics we used in Tables 2 and 3 can hardly show the differences in performance among models (e.g., a model predicting all samples as real would have a deceptively high accuracy of 0.990). Here, we report macro F1 and the standardized partial AUC with a false positive rate of at most 0.1 ($\mathrm{spAUC}_{FPR \le 0.1}$; McClish, 1989; see Appendix C for the calculation details) under different real/fake ratios (from 10:1 to 100:1). As shown in Figure 7

Figure 6: Three fake news cases with different preferences on environmental information. Underlined regular words hit the keywords in the MACROENV and underlined italic words are related to the MICROENV. Keywords are extracted using TextRank (Mihalcea and Tarau, 2004). Panel (c), Macro ≈ Micro: the Wuhan pandemic is overwhelmingly popular, and Japan's ambulances are novel among the related events. *Gaokao: National College Entrance Examination in China.

but also inherently friendly to practical systems: 1) Timeliness. Our NEP works instantly as it only requires the post and mainstream news published a few days before. In practice, a system would not construct the required collection on demand but prepare it ahead by maintaining a queue of news items. 2) Compatibility. Our perception module can be integrated with existing methods, which we validated on six representative ones (Table 2). 3) Data Accessibility. The data to construct news environments is easy to access, especially compared with obtaining credible knowledge sources. These advantages may encourage the deployment of NEP in practical systems.
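For the skewed-data evaluation discussed in this section, the sketch below shows why accuracy is uninformative and how a standardized partial AUC can be computed. It assumes the standard McClish (1989) standardization, which rescales the partial area so that a random classifier scores 0.5 and a perfect one scores 1; the authors' exact calculation is given in their Appendix C, and the function names here are hypothetical.

```python
def roc_points(labels, scores):
    """ROC points (fpr, tpr); label 1 = fake (positive), 0 = real."""
    pos = sum(labels)
    neg = len(labels) - pos
    pairs = sorted(zip(scores, labels), key=lambda t: -t[0])
    points, tp, fp, i = [(0.0, 0.0)], 0, 0, 0
    while i < len(pairs):
        j = i
        # Handle tied scores together so a tie forms one diagonal segment.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp += pairs[j][1]
            fp += 1 - pairs[j][1]
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def standardized_partial_auc(labels, scores, max_fpr=0.1):
    """Partial AUC over FPR in [0, max_fpr], standardized to [0.5, 1]."""
    points = roc_points(labels, scores)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Linearly interpolate the TPR at the FPR cutoff.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    min_area, max_area = 0.5 * max_fpr ** 2, max_fpr
    return 0.5 * (1 + (area - min_area) / (max_area - min_area))


# Accuracy of a trivial "always real" model on the skewed online set.
all_real_accuracy = 30977 / (30977 + 309)

perfect = standardized_partial_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
uninformative = standardized_partial_auc([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])
```

Restricting the area to low false positive rates reflects the operating range that matters when real posts vastly outnumber fake ones.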

Conclusion and Future Work
We proposed the NEP to observe news environments for fake news detection on social media. We designed popularity- and novelty-oriented perception modules to assist fake news detectors. Experiments on offline and online data show the effectiveness of NEP in boosting the performance of existing models. We drew insights on how NEP helps to interpret the contributions of the macro and micro environments in fake news detection. As this is the first work on the role of news environments for fake news detection, we believe further exploration is required for a deeper understanding of the effects of news environments and beyond. In the future, we plan to explore: 1) including historical news or background to handle posts weakly related to the present environment; 2) modeling post-environment relationships with diverse similarity metrics or even from other perspectives; 3) investigating the effects of different news environments (e.g., biased vs. neutral ones) to make the environment construction more principled; 4) extending this type of methodology from text-only detection to multi-modal and social graph-based detection.

no need to wait for the accumulation of user responses or queries to knowledge sources. Due to the requirement of real-time access to open news sources (the source list can be determined as needed), it might be easier to deploy for service providers (e.g., news platforms) and media outlets.
Data. Our data is mostly based on existing datasets, except the news items for constructing news environments. All news items (or headlines) are open and accessible to readers and have no issues with user privacy. The media outlets in the English dataset might be considered "biased", so we carefully select a left, a center, and a right outlet (whose headlines are available) according to the AllSides Media Bias Chart. In China, a media outlet might be state-run (e.g., CCTV News), local-government-run (e.g., The Paper), or business-run (e.g., Toutiao News). With no widely recognized bias chart of Chinese media as a reference, we select media outlets from the three categories based on their influence (e.g., number of followers) on Weibo for the sake of representativeness.