newsLens: building and visualizing long-ranging news stories

Philippe Laban, Marti Hearst


Abstract
We propose a method to aggregate and organize a large, multi-source dataset of news articles into a collection of major stories, and to automatically name and visualize these stories in a working system. The approach can run online as new articles are added; it processes 4 million news articles from 20 news sources and extracts 80,000 major stories, some of which span several years. The visual interface consists of lanes of timelines, each annotated with information deemed important for the story, including extracted quotations. The working system allows a user to search and navigate 8 years of story information.
Anthology ID: W17-2701
Volume: Proceedings of the Events and Stories in the News Workshop
Month: August
Year: 2017
Address: Vancouver, Canada
Editors: Tommaso Caselli, Ben Miller, Marieke van Erp, Piek Vossen, Martha Palmer, Eduard Hovy, Teruko Mitamura, David Caswell
Venue: EventStory
Publisher: Association for Computational Linguistics
Pages: 1–9
URL: https://aclanthology.org/W17-2701
DOI: 10.18653/v1/W17-2701
Cite (ACL): Philippe Laban and Marti Hearst. 2017. newsLens: building and visualizing long-ranging news stories. In Proceedings of the Events and Stories in the News Workshop, pages 1–9, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal): newsLens: building and visualizing long-ranging news stories (Laban & Hearst, EventStory 2017)
PDF: https://aclanthology.org/W17-2701.pdf