Fine Grained Citation Span for References in Wikipedia

Besnik Fetahu, Katja Markert, Avishek Anand


Abstract
Verifiability is one of the core editing principles in Wikipedia, where editors are encouraged to provide citations for the content they add. For a Wikipedia article, determining which content is covered by a citation, i.e., the citation span, is not trivial, yet it is an important prerequisite for automatically finding citations for uncovered content and for fact assessment. We address the problem of determining the citation span in Wikipedia articles. We approach this problem by classifying which textual fragments in an article are covered by, or hold true given, a citation. We propose a sequence classification approach in which, for a paragraph and a citation, we determine the citation span at a fine-grained level. We provide a thorough experimental evaluation and compare our approach against baselines adopted from the scientific domain, showing improvements on all evaluation metrics.
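The task framing in the abstract (a paragraph and a citation go in, a covered/not-covered label per textual fragment comes out) can be illustrated with a small sketch. This is not the authors' model: the sentence-level fragment granularity, the token-overlap score, the 0.5 threshold, and all function names are assumptions made only to show the input and output shape. Independent per-fragment scoring stands in here for the sequence classifier the paper proposes.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a deliberately simple stand-in for real preprocessing."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def split_fragments(paragraph):
    """Split a paragraph into sentence-level fragments (assumed granularity)."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]


def label_citation_span(paragraph, citation_text, threshold=0.5):
    """Return (fragment, covered) pairs in paragraph order.

    A fragment counts as covered when the share of its tokens that also
    appear in the citation text reaches the (hypothetical) threshold.
    """
    cited = tokenize(citation_text)
    labels = []
    for fragment in split_fragments(paragraph):
        tokens = tokenize(fragment)
        overlap = len(tokens & cited) / len(tokens) if tokens else 0.0
        labels.append((fragment, overlap >= threshold))
    return labels


# Toy data, invented for illustration only.
paragraph = (
    "The bridge opened in 1932. "
    "It carries rail and road traffic. "
    "Local residents often call it the Coathanger."
)
citation_text = "The bridge opened to rail and road traffic in 1932."

span = label_citation_span(paragraph, citation_text)
for fragment, covered in span:
    print(("covered    " if covered else "not covered"), "|", fragment)
```

On this toy input the first two fragments are labeled as covered and the third is not, so the citation span is the first two sentences. A learned sequence model would replace the overlap score and could also take neighboring fragments into account.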
Anthology ID:
D17-1212
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Martha Palmer, Rebecca Hwa, Sebastian Riedel
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
1990–1999
URL:
https://aclanthology.org/D17-1212
DOI:
10.18653/v1/D17-1212
Cite (ACL):
Besnik Fetahu, Katja Markert, and Avishek Anand. 2017. Fine Grained Citation Span for References in Wikipedia. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1990–1999, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Fine Grained Citation Span for References in Wikipedia (Fetahu et al., EMNLP 2017)
PDF:
https://aclanthology.org/D17-1212.pdf
Video:
https://aclanthology.org/D17-1212.mp4