@inproceedings{spangher-etal-2024-tracking,
title = "Tracking the Newsworthiness of Public Documents",
author = "Spangher, Alexander and
Tumgoren, Serdar and
Welsh, Ben and
Peng, Nanyun and
Ferrara, Emilio and
May, Jonathan",
editor = "Ku, Lun-Wei and
Martins, Andre and
Srikumar, Vivek",
booktitle = "Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = aug,
year = "2024",
address = "Bangkok, Thailand",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.luhme-long.763/",
doi = "10.18653/v1/2024.acl-long.763",
pages = "14150--14168",
abstract = "Journalists regularly make decisions on whether or not to report stories, based on {\textquotedblleft}news values{\textquotedblright}. In this work, we wish to explicitly model these decisions to explore {\_}when{\_} and {\_}why{\_} certain stories get press attention. This is challenging because very few labelled links between source documents and news articles exist and language use between corpora is very different. We address this problem by implementing a novel {\_}probabilistic relational modeling{\_} framework, which we show is a low-annotation linking methodology that outperforms other, more state-of-the-art retrieval-based baselines. Next, we define a new task: {\_}{\_}newsworthiness prediction{\_}{\_}, to predict if a policy item will get covered. We focus on news coverage of local public policy in the San Francisco Bay Area by the {\_}San Francisco Chronicle{\_}. We gather 15k policies discussed across 10 years of public policy meetings, and transcribe over 3,200 hours of public discussion. In general, we find limited impact of public discussion on newsworthiness prediction accuracy, suggesting that some of the most important stories barely get discussed in public.Finally, we show that newsworthiness predictions can be a useful assistive tool for journalists seeking to keep abreast of local government. We perform human evaluation with expert journalists and show our systems identify policies they consider newsworthy with 68{\%} F1 and our coverage recommendations are helpful with an 84{\%} win-rate against baseline. We release all code and data to our work here: https://github.com/alex2awesome/newsworthiness-public."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="spangher-etal-2024-tracking">
<titleInfo>
<title>Tracking the Newsworthiness of Public Documents</title>
</titleInfo>
<name type="personal">
<namePart type="given">Alexander</namePart>
<namePart type="family">Spangher</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Serdar</namePart>
<namePart type="family">Tumgoren</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ben</namePart>
<namePart type="family">Welsh</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Nanyun</namePart>
<namePart type="family">Peng</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Emilio</namePart>
<namePart type="family">Ferrara</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jonathan</namePart>
<namePart type="family">May</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2024-08</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Lun-Wei</namePart>
<namePart type="family">Ku</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Andre</namePart>
<namePart type="family">Martins</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Vivek</namePart>
<namePart type="family">Srikumar</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Bangkok, Thailand</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>Journalists regularly make decisions on whether or not to report stories, based on “news values”. In this work, we wish to explicitly model these decisions to explore _when_ and _why_ certain stories get press attention. This is challenging because very few labelled links between source documents and news articles exist and language use between corpora is very different. We address this problem by implementing a novel _probabilistic relational modeling_ framework, which we show is a low-annotation linking methodology that outperforms other, more state-of-the-art retrieval-based baselines. Next, we define a new task: __newsworthiness prediction__, to predict if a policy item will get covered. We focus on news coverage of local public policy in the San Francisco Bay Area by the _San Francisco Chronicle_. We gather 15k policies discussed across 10 years of public policy meetings, and transcribe over 3,200 hours of public discussion. In general, we find limited impact of public discussion on newsworthiness prediction accuracy, suggesting that some of the most important stories barely get discussed in public.Finally, we show that newsworthiness predictions can be a useful assistive tool for journalists seeking to keep abreast of local government. We perform human evaluation with expert journalists and show our systems identify policies they consider newsworthy with 68% F1 and our coverage recommendations are helpful with an 84% win-rate against baseline. We release all code and data to our work here: https://github.com/alex2awesome/newsworthiness-public.</abstract>
<identifier type="citekey">spangher-etal-2024-tracking</identifier>
<identifier type="doi">10.18653/v1/2024.acl-long.763</identifier>
<location>
<url>https://aclanthology.org/2024.luhme-long.763/</url>
</location>
<part>
<date>2024-08</date>
<extent unit="page">
<start>14150</start>
<end>14168</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Tracking the Newsworthiness of Public Documents
%A Spangher, Alexander
%A Tumgoren, Serdar
%A Welsh, Ben
%A Peng, Nanyun
%A Ferrara, Emilio
%A May, Jonathan
%Y Ku, Lun-Wei
%Y Martins, Andre
%Y Srikumar, Vivek
%S Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
%D 2024
%8 August
%I Association for Computational Linguistics
%C Bangkok, Thailand
%F spangher-etal-2024-tracking
%X Journalists regularly make decisions on whether or not to report stories, based on “news values”. In this work, we wish to explicitly model these decisions to explore _when_ and _why_ certain stories get press attention. This is challenging because very few labelled links between source documents and news articles exist and language use between corpora is very different. We address this problem by implementing a novel _probabilistic relational modeling_ framework, which we show is a low-annotation linking methodology that outperforms other, more state-of-the-art retrieval-based baselines. Next, we define a new task: __newsworthiness prediction__, to predict if a policy item will get covered. We focus on news coverage of local public policy in the San Francisco Bay Area by the _San Francisco Chronicle_. We gather 15k policies discussed across 10 years of public policy meetings, and transcribe over 3,200 hours of public discussion. In general, we find limited impact of public discussion on newsworthiness prediction accuracy, suggesting that some of the most important stories barely get discussed in public.Finally, we show that newsworthiness predictions can be a useful assistive tool for journalists seeking to keep abreast of local government. We perform human evaluation with expert journalists and show our systems identify policies they consider newsworthy with 68% F1 and our coverage recommendations are helpful with an 84% win-rate against baseline. We release all code and data to our work here: https://github.com/alex2awesome/newsworthiness-public.
%R 10.18653/v1/2024.acl-long.763
%U https://aclanthology.org/2024.luhme-long.763/
%U https://doi.org/10.18653/v1/2024.acl-long.763
%P 14150-14168
Markdown (Informal)
[Tracking the Newsworthiness of Public Documents](https://aclanthology.org/2024.luhme-long.763/) (Spangher et al., ACL 2024)
ACL
- Alexander Spangher, Serdar Tumgoren, Ben Welsh, Nanyun Peng, Emilio Ferrara, and Jonathan May. 2024. Tracking the Newsworthiness of Public Documents. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 14150–14168, Bangkok, Thailand. Association for Computational Linguistics.