Conflicts, Villains, Resolutions: Towards models of Narrative Media Framing

Despite increasing interest in the automatic detection of media frames in NLP, the problem is typically simplified as single-label classification and adopts a topic-like view on frames, evading modelling the broader document-level narrative. In this work, we revisit a widely used conceptualization of framing from the communication sciences which explicitly captures elements of narratives, including conflict and its resolution, and integrate it with the narrative framing of key entities in the story as heroes, victims or villains. We adapt an effective annotation paradigm that breaks a complex annotation task into a series of simpler binary questions, and present an annotated data set of English news articles, and a case study on the framing of climate change in articles from news outlets across the political spectrum. Finally, we explore automatic multi-label prediction of our frames with supervised and semi-supervised approaches, and present a novel retrieval-based method which is both effective and transparent in its predictions. We conclude with a discussion of opportunities and challenges for future work on document-level models of narrative framing.


Introduction
Media discourse around contested issues is often biased by the experiences or interests of the news outlets and the different stakeholders they give voice to. News framing by the media has been formalized and examined on many levels in the communication and social sciences, ranging from the selection of information (Levitt, 1981) over discourse-centric (Pan and Kosicki, 1993) to entity-focused approaches (Lawlor and Tolley, 2017). While a growing body of work in NLP attempts to automatically detect framing in the news or social media, most work adopts well-defined yet oversimplifying approaches like topic modeling or simple classifiers (see Ali and Hassan (2022) for a recent review); formalizes the task as single-label classification, ignoring the co-existence and interaction of different frames; and focuses on localized emphasis frames rather than the full story (Card et al., 2015; Field et al., 2018; Khanehzar et al., 2021).
This paper addresses the above shortcomings by considering framing through the lens of narratives. We adopt a small set of high-level framing devices established in the communication literature (Neuman et al., 1992; Semetko and Valkenburg, 2000), and integrate them with narrative roles assigned to key actors (entities) in the discourse. Table 1 defines the frames and associated entity roles. We argue that more nuanced and transparent automated models of framing are essential to meaningfully support social studies into a systematic understanding of the viewpoints presented by different stakeholders in contested issues such as climate change. Our contributions to this end are (1) introducing an established frame inventory and annotation procedure from the communication sciences into NLP; (2) a labeled data set; (3) a case study on the framing of climate change, showcasing the potential of our annotations for large-scale media analysis; and (4) experiments on automatic frame prediction, including an effective and transparent retrieval-based classifier to predict multiple frames per article.
Table 1 (top) summarizes the frames used in our work. Our framework departs from existing NLP approaches in two ways: first, we adopt a multi-label classification paradigm, allowing for a more nuanced analysis and avoiding the oversimplification of framing to a single label per item. Second, our framework emphasizes the narrative structure of the full article: frames such as conflict, resolution or human interest are central building blocks of narratives, and are dominant in news coverage of contested issues (Semetko and Valkenburg, 2000). Building on components of the Narrative Policy Framework (Shanahan et al., 2018), we identify the key entities responsible for the issue (villains), those who are affected (victims) and those who can resolve the issue (heroes); see Table 1 (bottom).
We apply our framework to the issue of climate change, a pressing global challenge with wide-reaching impacts (Pew Research Center, 2022), which remains politically contested in terms of the understanding of its urgency, causes, and possible solutions (Sparkman et al., 2022). Importantly, studies show that climate 'skeptics' (and those lacking scientific backing) are cited almost twice as often in the mainstream news media as those calling for climate action (Wetts, 2020), rendering the examination of media framing and its effects on public support for climate change mitigation a pressing goal. While a substantive body of work on climate change framing has emerged in the social and communication sciences (Nisbet, 2009; Wolters et al., 2022), the issue has attracted surprisingly little attention in the NLP community to date. Exceptions include work on stance detection in news (Luo et al., 2020) or social media (Vaid et al., 2022) and models of scepticism detection (Bhatia et al., 2021), whereas we focus on the narrative framing of the issue across political leanings. To recap, in this paper we present: • The concept of "Narrative Media Framing", formalized through a set of frames about conflicts, their effects and resolutions, which are integrated with narrative roles assigned to key actors (Section 3).
• The narrative frames corpus of 428 English news articles on climate change labeled with frame devices (Table 1). Following Semetko and Valkenburg (2000), annotators answered binary indicator questions, and the final frame labels were derived from the answer set (Section 3).
• A detailed analysis of our annotated data set, highlighting the interaction of frames and narrative roles, and differences across media outlets with different political bias (Section 4).
• Experiments on automatic frame prediction with semi-supervised and supervised methods, including a new simple and transparent, yet effective method which combines retrieval with classification (Section 5).

Background
Media framing refers to the deliberate presentation of information in order to elicit a desired response or shift in readers' attitudes. We introduce into NLP five high-level frames (Table 1, top), identified by Semetko and Valkenburg (2000) as covering the dominant framing in reporting on contested issues with the aim of attracting readers' attention (Mendelsohn et al., 2021). These categories have been applied via manual content analysis to a variety of issues and events, ranging from the media coverage of the Egyptian revolution (Fornaciari, 2012), over the MH370 crash (Bier et al., 2018), to climate change (Dotson et al., 2012; Feldman et al., 2017).
To identify each frame, Semetko and Valkenburg (2000) proposed a set of binary indicator questions which improved annotation quality and portability of the framework across studies. We construct the first publicly available data set annotated with this framework, cover a larger and more diverse set of news articles than prior work, and link the frames with narrative roles assigned to key entities appearing in the story. Media framing may manifest through the narrative roles - Hero, Villain or Victim - assigned to key entities in a document, and this phenomenon has been widely studied in the communication sciences in general (Shanahan et al., 2018) and in the context of climate change in particular (Lück et al., 2018). We draw on this work, as well as work which identified key stakeholder categories in the climate change discourse (Haigh and Griffiths, 2009; Ahchong and Dodds, 2012; Chen et al., 2022), to analyze the framing of entities in news articles along the political spectrum.
NLP studies on framing have predominantly focused on emphasis framing, the strategic inclusion or omission of aspects of an issue, such as legality or public opinion (Card et al., 2015), or, to a lesser extent, on equivalence framing as different expressions of identical concepts ("alien" vs "immigrant"; Lee et al. (2022); Ziems et al. (2022a)). Both perspectives focus on local, lexical signals. Emphasis framing is typically formalized as single-label prediction of the most dominant frame in a news article (Card et al., 2015), headline (Liu et al., 2019; Akyürek et al., 2020), or social media post (Johnson et al., 2017; Hartmann et al., 2019). The media frames corpus (Card et al., 2015) is one of the most comprehensive frame-labeled data sets, comprising several thousand news articles across five contested issues. While the data includes span-level labels which could be used for multi-label classification, work using the MFC predominantly attempts document-level prediction of a single "primary" article frame, disregarding span labels (Ji and Smith, 2017; Khanehzar et al., 2021), although see Field et al. (2018) for an exception. More broadly, our work complements emphasis frames by considering more abstract frames around conflict, resolution, and personal, moral and economic impacts. Both formalizations of framing have a strong foundation in the communication literature, and studying their interaction at scale with NLP methodology is an interesting avenue for future work. Mendelsohn et al.
(2021) consider a variety of framing strategies in the context of tweets, but with less focus on story structure due to the short document lengths. Our framework complements their work in three ways: i) we approach framing as multi-label classification, relaxing the assumption of a single frame per article; ii) we present a set of frames that are abstract, with evidence distributed across a document, requiring a higher-level document model; and iii) we link frames with narratives via entity roles in a unified annotation framework consisting of a series of binary indicator questions, allowing us to study the interplay of framing and narratives.

The Narrative Frames Corpus
We identified 17.9K English-language news articles on climate change published in 2017-2019 in the UK and the US by matching a set of climate change-specific keywords against articles from the NELA corpora (Horne et al., 2018; Nørregaard et al., 2019; Gruppi et al., 2020). See Appendix A for more details. For each article, NELA provides metadata about its publication date, media outlet, and the outlet's political leaning as identified by the Media Bias Fact Check (MBFC) website. We manually annotated a subset of 428 articles of this data set, balanced across the three years and the four most dominant MBFC categories: left, center-left, right and questionable source. We recruited four on-site annotators, all English native speakers with a background in the social/political sciences. The annotators went through an extensive training phase including several rounds of feedback. Details of annotator remuneration can be found in the Ethics statement.
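The keyword-based selection of climate articles can be sketched as a simple case-insensitive match over article text. The keyword list below is purely illustrative (the actual set is given in Appendix A of the paper):

```python
import re

# Placeholder keyword list -- NOT the published set from Appendix A.
KEYWORDS = ["climate change", "global warming", "carbon emissions"]
PATTERN = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)

def is_climate_article(text: str) -> bool:
    """Keep an article if it matches any climate-specific keyword."""
    return PATTERN.search(text) is not None

articles = [
    "New study warns that Global Warming accelerates ice melt.",
    "Local team wins the championship after extra time.",
]
kept = [a for a in articles if is_climate_article(a)]  # keeps only the first
```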
Frame annotations We adapted Semetko and Valkenburg (2000)'s frame indicator questions. We added a pre-screening question to confirm that an article is predominantly (>70%) about climate change, removed one question about visual information (as we focus on text only), and changed wording specific to the 'government' to 'any entity' to align with our broad definition of stakeholder entities, discussed below. The full questionnaire is shown in Appendix B. Annotators were presented with the full article text together with the questionnaire, but no explicit meta information such as outlet name or date of publication.
The raw annotations provided answers to a list of binary indicator questions. We verified that the mapping (i.e., the factor structure) between the five frames in Table 1 and their associated indicator questions in Semetko and Valkenburg (2000) replicates in our annotated data set. To do so we ran a confirmatory factor analysis (CFA; Brown and Moore (2012)). We removed all items with a factor loading < 0.3 (as not fitting well into any of the five factors), retaining a total of 13 indicators, with 2-3 indicators loading on a given frame. These questions are listed in Table 2. The final model fitted the data well (CFI = .945; RMSEA = .052 [.039, .065], p = .370; SRMR = .059), confirming the five-factor structure. An article was then labeled with a frame if ≥ 2 indicator questions for that frame were answered 'yes' by ≥ 2 annotators. This resulted in a multi-label data set with articles covering zero (12%), one (39%), two (32%), three (15%) or four (3%) frames. See Appendix F for additional data set statistics.

Table 2: The retained indicator questions per frame.

RE: (1) Does the story suggest a solution(s) to the issue/problem?; (2) Does the story suggest that some entity could alleviate the problem?

CO: (1) Does the story reflect disagreement between political parties/individuals/groups/countries?; (2) Does one party/individual/group/country reproach another?; (3) Does the story refer to two sides or more than two sides of the problem or issue?

HI: (1) Does the story provide a human example or a "human face" on the problem/issue?; (2) Does the story employ adjectives or personal vignettes that generate feelings of outrage, empathy-caring, sympathy, or compassion?; (3) Does the story go into the private or personal lives of the entities involved?

MO: (1) Does the story contain any moral message?; (2) Does the story make reference to morality, God, and other religious tenets?

EC: (1) Is there a mention of financial losses or gains now or in the future?; (2) Is there a mention of the costs/degree of the expense involved?; (3) Is there a reference to the economic consequences of pursuing (or not) a course of action?
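The aggregation of raw indicator answers into frame labels can be sketched as follows. This is a minimal sketch of one reading of the rule above (a frame is assigned if at least two of its indicator questions each receive 'yes' from at least two annotators); the data structures are illustrative, not the paper's released format:

```python
def derive_frame_labels(annotations, frame_questions,
                        min_yes_annotators=2, min_questions=2):
    """Aggregate binary indicator answers into multi-label frame tags.

    annotations: {annotator_id: {question_id: bool}} for one article
    frame_questions: {frame: [question_id, ...]} -- the retained indicators
    A frame is assigned if >= min_questions of its indicator questions
    were each answered 'yes' by >= min_yes_annotators annotators.
    """
    labels = set()
    for frame, questions in frame_questions.items():
        confirmed = 0
        for q in questions:
            yes_votes = sum(1 for ans in annotations.values() if ans.get(q, False))
            if yes_votes >= min_yes_annotators:
                confirmed += 1
        if confirmed >= min_questions:
            labels.add(frame)
    return labels
```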
Entity annotations Entities were annotated as part of the binary indicator questionnaire introduced above. Three indicator questions assessed whether an article contained an entity that could alleviate the problem (Hero), was responsible for the problem (Villain), or was negatively affected by the issue (Victim). If an annotator answered 'yes' to any of these questions, they were asked to identify the most appropriate entity in the text. An entity was meant to be selected only if the article was explicit about that entity's role (e.g., a politician was depicted as "the only person who could save the planet") and strictly based on the entity's presentation in the article, rather than the annotator's opinion about that entity (e.g., if an article presented Trump as a person who could mitigate climate change, the annotator was supposed to tag him as a Hero, even if they did not personally agree with that interpretation). We included all entities extracted by our annotators as part of our published data set.

Annotator agreement Krippendorff's α across four annotators and 13 frame indicator questions is 0.52, indicating fair agreement, as expected for a complex task like frame annotation. Average pairwise agreement without chance correction is 0.78 (min = 0.75, max = 0.81). A total of 2,185 entities were extracted across all narrative roles. Average pairwise agreement on the existence of a role in an article was 0.59 (Krippendorff's α = 0.40). To assess agreement on the identity of entities for roles which were attested by at least two annotators, we computed the exact string match of associated entities, after basic text normalization. Entities match exactly 41% of the time. We also computed more lenient metrics based on token overlap (average ROUGE-L = 0.45) and embedding similarity (average BERTScore = 0.91) between pairs of extracted entities.

The agreement for both role detection and entity-role assignment was low overall, suggesting that the task is challenging. In this paper, we use the narrative role labels in the exploratory analysis in Section 4 and discuss future work on computational modeling of narrative roles in Section 6.
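The exact-match entity agreement above can be sketched as follows. The normalization steps (lowercasing, punctuation and whitespace stripping) are illustrative assumptions, as the paper only says "basic text normalization":

```python
import re

def normalize(entity: str) -> str:
    """Illustrative normalization: lowercase, strip punctuation,
    collapse whitespace. The paper's exact steps may differ."""
    entity = entity.lower()
    entity = re.sub(r"[^\w\s]", "", entity)
    return re.sub(r"\s+", " ", entity).strip()

def entity_exact_match_rate(entity_pairs):
    """Fraction of annotator pairs whose extracted entities for the
    same role match exactly after normalization."""
    matches = sum(normalize(a) == normalize(b) for a, b in entity_pairs)
    return matches / len(entity_pairs)

pairs = [("President Trump", "president trump"), ("the EPA", "US government")]
rate = entity_exact_match_rate(pairs)  # 0.5
```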

Stakeholder categories
We grouped the >2K extracted entities into a smaller set of stakeholder categories to ease analysis. We identified 10 such categories from the previous literature (Ahchong and Dodds, 2012; Blair and McCormack, 2016; Chen et al., 2022; Haigh and Griffiths, 2009), adopting a broad definition of stakeholders which includes groups or entities that 'affect or are affected by' the issue of climate change (Freeman, 1984). The set of stakeholder categories is shown in Figure 1b, and Appendix E provides additional details. One annotator assigned each unique extracted entity to its most appropriate stakeholder category (or a generic category 'Other' if no other category fit).
In sum, the narrative frames corpus consists of 428 English news articles labeled with (1) multi-label frame categories; (2) narrative roles for specific entities; and (3) their associated stakeholder category; as well as meta-data about the article's date of origin, outlet, and associated political leaning.

Narrative Framing of Climate Change
We conduct an exploratory analysis on the framing of climate change in media outlets with different political leanings, as well as the interplay of frames, narrative roles and stakeholder categories.
Framing and political leaning Figure 2 shows the proportion of articles mentioning each frame by the media outlets' political leaning. Conflict (CO) and Resolution (RE) are the most prevalent across all leanings. The Moral frame (MO) is the least prevalent throughout. This pattern is partially consistent with previous research. Dirikx and Gelders (2010) found Resolution, but not Conflict, to dominate climate change reporting in the Netherlands and France in the early 2000s, which might suggest that the discourse on climate change has become more polarized over time, particularly in our data set of US and UK news coverage, where the media landscape is strongly partisan. For example, in a more recent study involving four major US newspapers, Kim and Wanta (2018) show that Conflict is the most common frame in the context of US immigration.
Resolution (RE) is more prevalent in the left-leaning outlets (left, left_center), while the opposite is true for Human Interest (HI): right-leaning (and questionable) outlets are more likely to refer to personal stories and use language evoking empathy. These findings are partially consistent with prior work: e.g., Feldman et al. (2017) show that both the Economic and Conflict frames are more likely to be used in conservative outlets, while we find Conflict prevalent across the board. However, Feldman et al. (2017) only included three major US newspapers, in contrast to 41 in our analysis.
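The leaning-wise frame proportions underlying this analysis can be computed with a short sketch. The data layout (a list of leaning/frame-set pairs) is an assumption for illustration:

```python
from collections import Counter, defaultdict

def frame_proportions(articles):
    """articles: list of (leaning, set_of_frames) pairs.
    Returns {leaning: {frame: proportion of that leaning's articles
    mentioning the frame}} -- the quantity plotted in Figure 2."""
    totals = Counter(leaning for leaning, _ in articles)
    counts = defaultdict(Counter)
    for leaning, frames in articles:
        for f in frames:
            counts[leaning][f] += 1
    return {lean: {f: c / totals[lean] for f, c in cnt.items()}
            for lean, cnt in counts.items()}

data = [("left", {"RE", "CO"}), ("left", {"RE"}), ("right", {"HI", "CO"})]
props = frame_proportions(data)  # props["left"]["RE"] == 1.0
```

Because the data set is multi-label, proportions per leaning need not sum to one, which is why each frame is reported independently.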
Frames, roles and stakeholders Figure 1 illustrates the association of narrative roles with different frames (1a) and stakeholders (1b). Unsurprisingly, the Hero, an entity presented with the ability to fix or alleviate the issue under discussion, is the most prevalent role in the Resolution frame. The Villain dominates most other frames, except for the Human Interest frame, where the Victim is equally dominant. This aligns with a well-known "negativity bias" in news reporting, i.e., a dominance of negative content with a focus on problems, conflicts and their causes and victims (Soroka et al., 2019).
We explore the distribution of roles across stakeholder categories in Figure 1b. Overall, Governments & Politicians are the most dominant stakeholder category, typically depicted as the Villain (of all stakeholder categories, they are also the most likely to be depicted as the Hero, pointing to ambivalent attitudes toward this category). The Environment and the General Public dominate the Victim role, somewhat unexpectedly followed by Governments & Politicians and Industry & Emissions. We explain this phenomenon next by disentangling the labels by political leaning.
Figure 3 reveals how the framing of a particular stakeholder category can vary with the political leaning of the source. Right-leaning media are more likely to depict Environmental Activists & Organisations and Legislation as the Villain, and Industry & Emissions as either the Hero or the Victim, in the context of climate change news. Conversely, left-leaning media are more likely to frame Legislation as a Hero, cover Environmental Activists less frequently overall, and predominantly frame the Industry as a Villain.

Narrative Frame Prediction
Predicting narrative frames automatically and with high quality would open new possibilities for scaling media framing analyses to larger data sets, longer time spans or more languages. Given the political sensitivity of automated media analysis, models should not only be reliable but also transparent in their predictions. To this end, we present Retrieval-Based Frame prediction (RBF), which incorporates an embedding-based retrieval module into supervised classifiers. We compare RBF against a range of neural classifiers on multi-label frame prediction. RBF not only outperforms off-the-shelf fine-tuned transformers on this task, but also increases interpretability by predicting frames for a given article together with the most relevant article sentences for the frame as evidence. Section 6 discusses additional modelling tasks supported by our data set to be addressed in future work.

RBF: Retrieval-Based Frame Prediction
We propose a simple method, retrieval-based frame prediction (RBF), which combines pre-trained language model embeddings with a retrieval objective. Similar approaches have been previously proposed in the context of word-sense disambiguation and semantic frame prediction (Jiang and Riloff, 2021; Blevins and Zettlemoyer, 2020). We embed (i) short frame descriptions f_1, ..., f_C and (ii) the sentences of an input news article s_1, ..., s_N in a joint space, and retrieve the sentences most proximate to each frame embedding:

rel(s_i, f_j) = cos(h_{s_i}, h_{f_j}),

where h_{s_i} = emb(s_i) and h_{f_j} = emb(f_j) are the embeddings of sentence s_i and frame f_j, respectively; the relevance rel of s_i to f_j corresponds to their cosine similarity. We use SentenceBERT (Reimers and Gurevych, 2019) as our embedding method emb. Given an article, we obtain C frame-specific relevance rankings of all sentences in the input article.
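The retrieval step can be sketched directly from the relevance definition above. Here the embeddings are arbitrary NumPy vectors for illustration; in the paper they come from SentenceBERT:

```python
import numpy as np

def rank_sentences_by_frame(sent_embs: np.ndarray, frame_emb: np.ndarray):
    """Rank article sentences by cosine similarity to a frame embedding.

    sent_embs: (N, d) matrix of sentence embeddings h_{s_i}
    frame_emb: (d,) frame-description embedding h_{f_j}
    Returns (indices, scores) in order of decreasing relevance.
    """
    sent_norm = sent_embs / np.linalg.norm(sent_embs, axis=1, keepdims=True)
    frame_norm = frame_emb / np.linalg.norm(frame_emb)
    scores = sent_norm @ frame_norm  # rel(s_i, f_j) = cos(h_{s_i}, h_{f_j})
    order = np.argsort(-scores)
    return order, scores[order]

# Toy example: the first and third sentences point in the frame's direction.
order, scores = rank_sentences_by_frame(
    np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]),
    np.array([1.0, 0.0]),
)
```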
We then train a linear classifier to predict the presence or absence of a frame in an article based on the most relevant sentences by our measure above. We include five input channels: channels (1)-(3) are the three sentences most relevant for a frame according to RBF relevance; channel (4) includes all sentences exceeding a relevance threshold θ > 0.15, except for sentences (1)-(3), concatenated with a [SEP] token; and channel (5) contains the news article truncated at 256 tokens. Each channel is encoded with the Longformer (Beltagy et al., 2020); the final hidden state embeddings are concatenated and passed into the classifier. Longformer parameters are fine-tuned during the training process. We evaluate the importance of different channels by ablating the full article channel (5) (RBF -a) and, additionally, the threshold sentence channel (4) (RBF -a,t).
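The channel construction can be sketched as below. This is a minimal sketch: tokenization is whitespace-based for illustration, whereas the paper truncates with the Longformer tokenizer:

```python
def build_rbf_channels(sentences, scores, article_text,
                       theta=0.15, max_article_tokens=256):
    """Assemble the five RBF input channels for one frame.

    sentences: article sentences; scores: their RBF relevance to the frame.
    Channels 1-3: the three most relevant sentences.
    Channel 4: remaining sentences above threshold theta, joined by [SEP].
    Channel 5: the article truncated to the first tokens.
    """
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    channels = [sentences[i] for i in ranked[:3]]
    above = [sentences[i] for i in ranked[3:] if scores[i] > theta]
    channels.append(" [SEP] ".join(above))
    channels.append(" ".join(article_text.split()[:max_article_tokens]))
    return channels

channels = build_rbf_channels(
    ["s1", "s2", "s3", "s4", "s5"],
    [0.9, 0.8, 0.7, 0.2, 0.1],
    " ".join(str(i) for i in range(300)),
)
```

In the full model each channel would then be encoded separately by the Longformer before the embeddings are concatenated for classification.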
RBF combines two desiderata: First, it identifies multiple sentences relevant to a target frame, capturing key evidence that may be distributed across the article rather than localized. Second, RBF's frame-based sentence retrieval backbone can be interpreted as an 'explicit attention mechanism', customized to the frame label to be predicted, with sentences serving as evidence. We evaluate the retrieved sentences in terms of their interpretability in Section 5.3.1.

[Table 3 (caption, partial): ... and RBF (middle), as well as an ablation of RBF channels (bottom). We report macro-averaged precision and recall across the five labels with standard deviation (in brackets), and their harmonic mean (F1).]

Experimental Setup
Given the small size of the Narrative Frames Corpus, we adopt the simplest formalization of multi-label classification: for each model class (row in Table 3) [...] Snippext (Miao et al., 2020; Berthelot et al., 2019) is a method for semi-supervised fine-tuning of pre-trained language models which was originally proposed in computer vision, but recently adapted to semi-supervised opinion mining (Miao et al., 2020). Snippext fine-tunes BERT using an interpolation of a small amount of gold-labeled data and a much larger set of unlabeled data with predicted, soft labels, drawing on the MixMatch strategy recently proposed in computer vision (Berthelot et al., 2019). We augment our small labeled training data set with the ≈17.5K unlabelled climate-related articles (cf. Section 3). The input to all transformer models is truncated to 256 tokens. Detailed training settings and model parameters are provided in Appendix G.
Metrics We evaluate models on correctly predicting the presence of frames in articles. We report macro-averaged precision and recall over the five frames, assigning equal importance to each frame label, as well as their harmonic mean (F1 score).
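The evaluation metric can be sketched as follows: per-frame precision and recall are macro-averaged, and F1 is the harmonic mean of the two macro scores (a sketch of the metric as described; the released evaluation code may differ in edge-case handling):

```python
def macro_prf(gold, pred, frames):
    """gold, pred: per-article frame sets. Treat each frame as a binary
    presence/absence task, macro-average precision and recall over the
    frames, and report the harmonic mean of the two as F1."""
    precisions, recalls = [], []
    for f in frames:
        tp = sum(f in g and f in p for g, p in zip(gold, pred))
        fp = sum(f not in g and f in p for g, p in zip(gold, pred))
        fn = sum(f in g and f not in p for g, p in zip(gold, pred))
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    p = sum(precisions) / len(frames)
    r = sum(recalls) / len(frames)
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```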

Main results
Table 3 shows the frame prediction results. All models significantly outperform the random and majority baselines. All neural methods perform better than the non-neural KNN. BERT performs worse than the Longformer, presumably due to the Longformer's higher capacity, with 1.5× the parameters of BERT. RBF is best overall, suggesting that combining Longformer embeddings with a relevance-based sentence retrieval backbone helps the model focus on frame-relevant context. We ablate the impact of the different channels in RBF in Table 3 (bottom). Model performance drops with the removal of each input channel, suggesting that the input channels are complementary and each contributes to performance. Snippext and RBF perform comparably, with inverse emphasis on precision and recall; however, only RBF offers explicit evidence for its predictions (which we explore in the next section). A semi-supervised extension of RBF is a promising avenue for future work.
Given the multi-label nature of our data set, a natural question is how often models predict all and only the annotated frames for an article (exact match). RBF does so 18% of the time. Appendix H provides more detailed results and analyses of per-frame and per-label performance.
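The exact-match statistic can be sketched as a strict set equality per article (a minimal sketch with toy data, not the paper's evaluation script):

```python
def exact_match_rate(gold, pred):
    """Fraction of articles where the predicted frame set equals the
    annotated frame set exactly (all and only the gold frames)."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

gold = [{"CO", "RE"}, {"HI"}, set()]
pred = [{"CO", "RE"}, {"HI", "MO"}, {"EC"}]
rate = exact_match_rate(gold, pred)  # 1/3
```

Note that an article annotated with zero frames counts as a match only if the model also predicts the empty set, which makes exact match a deliberately strict criterion.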

Qualitative analysis
For each frame, we inspect sentences retrieved as highly relevant by RBF. Table 4 displays these sentences. We boldfaced the most relevant phrases for ease of exposition. The selected sentences align closely with the definition of each frame: for Human Interest, they refer to the struggles of affected individuals and evoke empathy; for Moral, they refer to God, religion and moral values; and for Resolution, they mention explicit solutions. One intriguing direction for future work will be to study the differences in the manifestation of different frames across outlets from different sides of the political spectrum.

Discussion
What are the recurring narratives that frame the public discourse about contested issues like climate change? Existing NLP approaches to frame prediction fall short of answering this question due to a focus on localized signals. Drawing on theories from the social and communication sciences, we introduced a set of narrative framing devices to NLP, and integrated them with narrative roles assigned to the central entities in news articles.
We applied our framework to the issue of climate change, annotating >400 English-language news articles from major outlets with different political leanings with multi-label frames, narrative roles of entities, and their stakeholder categories. Our exploratory analysis demonstrated how our framework can be utilized to study multiple levels of framing, including differences across outlets; the co-occurrence of frames and narrative roles; and the assignment of narrative roles to stakeholder categories.
With the ultimate goal of scaling such analyses to larger, unlabeled data sets, we introduced RBF, an effective and interpretable retrieval-based frame classifier. The 'explicit attention' module of RBF not only improved performance over its backbone, the vanilla Longformer, but also naturally provides evidence for its predictions as a list of relevance-ranked article sentences.
Our work addresses a disconnect between the complexity of framing acknowledged in the communication science literature and models of framing in NLP. As recently surveyed by Ali and Hassan (2022), NLP approaches to framing predominantly focus on topic models or frequency-based methods, leaning heavily on local lexical signals as indicators for the presence or absence of a single frame per unit of analysis. However, the framing of a news article typically emerges from indicators spread throughout the text; frames can co-exist and interact with each other within a single news story. This paper takes one step towards such an integrated notion of framing in NLP by considering narrative frames and roles at the article level and adopting a multi-label task formalization.
Our work and results suggest many avenues for future research. The Narrative Frames Corpus supports research on joint models of framing and entity roles: the presence of an entity with a specific role (e.g., the Hero) should render the presence of certain frames (e.g., Resolution) more likely. Conversely, frames like Conflict affect the number and kinds of roles likely to be present (e.g., the Hero and the Villain). A joint model of frames and narrative roles could incorporate role labels with soft confidence weights as latent signal into a frame classification model.

Table 4: Three top relevant sentences extracted by RBF for articles which were correctly predicted as containing each frame, in order of decreasing relevance. Highlights of relevant phrases manually added in bold.

RE: (1) However the study finds that no single solution will avert the dangers, so a combined approach is needed. (2) The key element is that these three solutions must be implemented together." (3) We also looked at increasing the efficiency of water use, and we looked at better monitoring and recycling of fertiliser - lots of it is lost and it runs off into rivers and causes dead zones in the oceans."

CO: (1) Their answers - and reactions to them - foreshadowed the fight ahead with conservatives and industry regardless of who becomes the next president. (2) Democrats vying for president revealed a fundamental split over how aggressively the US should tackle climate change [. . .] in a seven-hour town hall meeting on Wednesday. (3) [. . .] held after the Democratic National Committee refused to sanction an official climate debate between candidates and amid unprecedented pressure from young activists and the Democratic voting base to tackle the climate crisis.

HI: (1) To Janet, this is a moral issue. (2) These were matters that we have historically agreed on, if for no other reason than the sake of our children and grandchildren's future. (3) And in Jordan's case, that would be Social Security, Medicare, education, health care and the like: Programs that benefit folks in his district.

MO: (1) It is, after all, the measure of one's moral fitness to value some things (say, forgiveness) over others (vengeance). (2) When organized religion fades and its would-be adherents are left to search for meaning, does the god of the environment end their search for a moral authority? (3) That statement not only describes Judas's moral disorder but also reminds the audience that any concern, holy as it may be - poverty reduction, environmental protection, or any other earthly mission - that does not give a preferential deference to God, His creation, and acts of beauty such as that of Mary Magdalene are sure signs of misaligned priorities.

EC: (1) Both can impact the relative financial attractiveness of future energy options. (2) "Even Milton Friedman understood the existence of market externalities, the fact that damage to our environment is not accounted for in the free market without placing some sort of price signal." (3) They end up helping certain wealthy people to the disadvantage of the less fortunate."
Annotator (dis)agreement and the aggregation of answers to indicator questions into frame tags provide fertile ground for future work. We echo a line of recent work that acknowledges label variation as a signal of genuine complexity rather than noise. This holds true particularly for complex tasks like frame annotation, which inevitably retain a level of subjective variation (Pavlick and Kwiatkowski, 2019; Plank, 2022). In this paper we aggregated indicator labels into a hard frame label by voting; however, we release the raw annotations as part of our data set. Future work could explore soft aggregation methods, delineate genuine variation from noise, and adopt disagreement-aware models and evaluation metrics.
The comparatively small size of the Narrative Frames Corpus and the competitive performance of the semi-supervised Snippext suggest further exploration of semi-supervised approaches. Integrating RBF with a Snippext-inspired semi-supervised framework, most simply by soft-labeling articles as a function of their retrieved sentences and RBF relevance scores, would allow leveraging large unlabeled data sets while retaining RBF's interpretability. Alternatively, one could adapt models from different domains, for instance by drawing on the literature on modeling narrative roles in folk tales (Valls-Vargas et al., 2014; Jahan et al., 2021).
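One illustrative instantiation of such soft labeling is to score an unlabeled article for a frame by pooling the RBF relevance scores of its top retrieved sentences. The function name, top-k mean pooling, and the assumption that scores lie in [0, 1] are all ours, not the paper's method:

```python
def soft_frame_label(sentence_scores, top_k=3):
    """Soft-label an unlabeled article for one frame as a function of
    its retrieved sentences' relevance scores: here, the mean of the
    top-k scores (assumed to lie in [0, 1]). Illustrative sketch only;
    other pooling functions (max, length-normalized sum) are possible.
    """
    if not sentence_scores:
        return 0.0
    top = sorted(sentence_scores, reverse=True)[:top_k]
    return sum(top) / len(top)

# An article whose best-retrieved sentences are highly relevant
scores = [0.9, 0.8, 0.7, 0.2, 0.1]
print(round(soft_frame_label(scores), 2))  # 0.8
```

The resulting soft labels could then seed a Snippext-style self-training loop over the unlabeled pool, while the retrieved sentences themselves remain available as evidence for each pseudo-label.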

Limitations
We acknowledge a range of limitations of our work.
As discussed in Sections 3 and 6, overall annotator agreement ranged from fair (frame annotations) to low (entity role annotations). We do not view this as a limitation per se, again pointing to the recent literature on the value of human label variation, which warns of a potential loss of valuable information if we focus too narrowly on arriving at a single gold label per instance with high confidence (Plank, 2022; Pavlick and Kwiatkowski, 2019). Future modeling work involving entity labels should, however, carefully inspect the role label variation, and potentially remove or aggregate selected annotations, before incorporating the labels as signal into predictive models. We explicitly refrained from training models on these labels in this paper to avoid the risk of training a predictor on an unfavorable signal-to-noise ratio.
Our data set focuses on English-language news reports, sampled from 2017 to 2019 from mainstream media outlets in the US and UK, and as such covers cultures and communities that are already well-resourced and well-studied. With climate change being a global challenge, broadening data sets, annotations, and models to more languages is an important direction for future work.
We explicitly caution against projecting annotations across languages without careful validation, as we expect the manifestation of framing and views on entities (or the sheer set of dominant entities) to vary widely across countries and communities.
Even within our English study, we acknowledge that the size of the annotated data set is small by NLP standards, and extending it in the future is desirable. A related limitation is the focus on a single issue (climate change); validating our narrative framing framework on other issues is an important direction for the future. Finally, the annotation process was slow and costly, relying on trained, highly educated annotators with constant monitoring, rendering larger-scale annotation challenging. On the other hand, we will release upon acceptance our annotation procedure, including the full codebook with instructions, which has been optimized over several rounds of annotation and which we hope can support more efficient annotation in the future.

Ethics statement
This study was approved by the University of Melbourne ethics board (Human Ethics Committee LNR 3B), Reference Number 2023-22109-37029-4, and data acquisition and analysis were carried out according to the corresponding ethical standards. We hired four local annotators, who were paid an hourly rate of AU$53, in line with the casual research assistant rates set out in the University of Melbourne collective agreement.
We will release the Narrative Frames Corpus, comprising 428 news articles annotated with frame labels, entities, their narrative roles, and stakeholder categories. We also publish the raw (non-aggregated) annotations. Our data set builds on news articles from the NELA corpora 2017-2019, which were released to the public domain (license CC0 1.0). We release our code and the Narrative Frames Corpus under an MIT license.

Select 'yes' if the story explicitly refers to religious tenets or moral obligations framed through the lens of obligations to a spiritual community. Select 'yes' also if the mention is indirect, e.g., through a quote or a metaphor.

MO3 19. Does the story offer specific social prescriptions about how to behave?
Select 'yes' if the story explicitly mentions expectations around norms of conduct, limitations or prohibitions on actions or events.

EC1
(0.28) 20. Is there a mention of financial losses or gains now or in the future?
Select 'yes' if the story explicitly refers to the financial impacts of the issue.

EC2
(0.37) 21. Is there a mention of the costs/degree of the expense involved?
Select 'yes' if the story explicitly refers to the amount of loss or gain (e.g., "$100,000", "enormous cost").

EC3
(0.25) 22. Is there a reference to the economic consequences of pursuing or not pursuing a course of action?
Select 'yes' if the story explicitly mentions the impacts of action or inaction on the economy.

D Media Outlets
Table 5 lists all media outlets in the labeled data set, together with the number of articles and MBFC political leaning.

F Label statistics
Figure 5 shows label distributions in our data set, including the prevalence of each of our five frames individually.

Figure 1: Association of roles with different frames (top) and stakeholder groups (bottom). Co-occurrence (count) of roles with frames; co-occurrence (count) of roles with stakeholder groups.

Figure 3: Narrative roles assigned to stakeholder categories in news outlets with different political leaning.

Figure 4 shows our annotation interface, split into an answer form (left) and an article display with (optional) highlighting of prevalent entities (right, in color).

Figure 4: Our annotation interface. Left: excerpt of the annotation form, which covers the 22 binary indicator questions and free-text fields to record role-specific entities. Right: the news article with prevalent entities highlighted (based on automatic entity recognition and co-reference resolution).

Table 1: Five frames (top) and three narrative roles (bottom) considered in this paper.

Table 3: Frame prediction results of baselines (top) and representative supervised and semi-supervised methods.

Table 3

(Beltagy et al., 2020) KNN classifier per frame; 4. BERT-medium (Devlin et al., 2019) fine-tuned for binary frame prediction; 5. Longformer-base-4096 (Beltagy et al., 2020) fine-tuned for binary frame prediction; and 6. an adaptation of the Snippext model.

If your answer is "yes", please select the most positively affected entity. Select 'yes' if the story explicitly refers to how one or more entity/ies benefit from the problem/issue.

Mark 'yes' if the story explicitly refers to the personal life of at least one entity.

Select 'yes' if the story explicitly refers to the active conflict between two or more entities, past or present.

Select 'yes' if the story explicitly mentions at least two viewpoints on the current issue (even if they are not presented in a balanced, objective manner).

CO4 16. Does the story refer to winners and losers? If your answer is "yes", please select the most appropriate winner/loser entity. Select 'yes' if the story explicitly refers to one or more 'winners' and/or 'losers' which emerged from an active conflict/argument/war. Note that in some stories an entity can be both a winner and a loser.
Table 6 lists our set of entity groups, together with some representative examples and the total number (tokens) of instances assigned to each group.