An Evaluation of Information Extraction Tools for Identifying Health Claims in News Headlines

Shi Yuan, Bei Yu


Abstract
This study evaluates the performance of four information extraction tools (extractors) on identifying health claims in health news headlines. A health claim is defined as a triplet: IV (what is being manipulated), DV (what is being measured), and their relation. Tools that can identify health claims provide the foundation for evaluating the accuracy of these claims against authoritative resources. The evaluation results show that 26% of headlines do not include health claims, and all extractors have difficulty separating them from the rest. For headlines with health claims, OPENIE-5.0 performed best, with an F-measure at the 0.6 level for extracting “IV-relation-DV”. However, the characteristic linguistic structures of health news headlines, such as incomplete sentences and non-verb relations, pose particular challenges to existing tools.
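As an illustration of the triplet definition and the F-measure mentioned in the abstract, here is a minimal sketch. It is not the authors' evaluation code: the `HealthClaim` type, the `normalize` and `f_measure` helpers, the example headlines, and the exact-match criterion (after lowercasing and whitespace cleanup) are all assumptions made for this example. The paper's actual matching rules may differ.

```python
from typing import Iterable, NamedTuple


class HealthClaim(NamedTuple):
    """Hypothetical container for one "IV-relation-DV" triplet."""
    iv: str        # what is being manipulated
    relation: str  # how the IV is said to affect the DV
    dv: str        # what is being measured


def normalize(claim: HealthClaim) -> HealthClaim:
    """Lowercase each field and collapse runs of whitespace."""
    return HealthClaim(*(" ".join(part.lower().split()) for part in claim))


def f_measure(predicted: Iterable[HealthClaim], gold: Iterable[HealthClaim]) -> float:
    """F1 over exact triplet matches (an assumed matching criterion)."""
    pred = {normalize(c) for c in predicted}
    ref = {normalize(c) for c in gold}
    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


# Invented annotations for two made-up headlines.
gold = [
    HealthClaim("coffee", "lowers", "risk of diabetes"),
    HealthClaim("exercise", "improves", "memory"),
]
# Invented extractor output: the first triplet matches after normalization,
# the second has a different relation and so counts as an error.
predicted = [
    HealthClaim("Coffee", "lowers", "risk of  diabetes"),
    HealthClaim("exercise", "boosts", "memory"),
]

score = f_measure(predicted, gold)
```

With one of two predicted triplets matching one of two gold triplets, precision and recall are both 0.5, so `score` is 0.5. A more lenient criterion, such as partial span overlap, would give different numbers.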
Anthology ID:
W18-4305
Volume:
Proceedings of the Workshop Events and Stories in the News 2018
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, U.S.A
Editors:
Tommaso Caselli, Ben Miller, Marieke van Erp, Piek Vossen, Martha Palmer, Eduard Hovy, Teruko Mitamura, David Caswell, Susan W. Brown, Claire Bonial
Venue:
EventStory
Publisher:
Association for Computational Linguistics
Pages:
34–43
URL:
https://aclanthology.org/W18-4305/
Cite (ACL):
Shi Yuan and Bei Yu. 2018. An Evaluation of Information Extraction Tools for Identifying Health Claims in News Headlines. In Proceedings of the Workshop Events and Stories in the News 2018, pages 34–43, Santa Fe, New Mexico, U.S.A. Association for Computational Linguistics.
Cite (Informal):
An Evaluation of Information Extraction Tools for Identifying Health Claims in News Headlines (Yuan & Yu, EventStory 2018)
PDF:
https://aclanthology.org/W18-4305.pdf