A Simpler and More Generalizable Story Detector using Verb and Character Features

Joshua Eisenberg, Mark Finlayson


Abstract
Story detection is the task of determining whether or not a unit of text contains a story. Prior approaches achieved a maximum performance of 0.66 F1, and did not generalize well across different corpora. We present a new state-of-the-art detector that achieves a maximum performance of 0.75 F1 (a 14% improvement), with significantly greater generalizability than previous work. In particular, our detector achieves performance above 0.70 F1 across a variety of combinations of lexically different corpora for training and testing, as well as dramatic improvements (up to 4,000%) in performance when trained on a small, disfluent data set. The new detector uses two basic types of features (ones related to events, and ones related to characters), totaling 283 specific features overall; previous detectors used tens of thousands of features, and so this detector represents a significant simplification along with increased performance.
Anthology ID:
D17-1287
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Martha Palmer, Rebecca Hwa, Sebastian Riedel
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
2708–2715
URL:
https://aclanthology.org/D17-1287
DOI:
10.18653/v1/D17-1287
Cite (ACL):
Joshua Eisenberg and Mark Finlayson. 2017. A Simpler and More Generalizable Story Detector using Verb and Character Features. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2708–2715, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
A Simpler and More Generalizable Story Detector using Verb and Character Features (Eisenberg & Finlayson, EMNLP 2017)
PDF:
https://aclanthology.org/D17-1287.pdf