PARC 3.0: A Corpus of Attribution Relations

Silvia Pareti


Abstract
Quotation and opinion extraction, discourse and factuality have all partly addressed the annotation and identification of Attribution Relations. However, disjoint efforts have provided a partial and partly inaccurate picture of attribution and generated small or incomplete resources, thus limiting the applicability of machine learning approaches. This paper presents PARC 3.0, a large corpus fully annotated with Attribution Relations (ARs). The annotation scheme was tested with an inter-annotator agreement study showing satisfactory results for the identification of ARs and high agreement on the selection of the text spans corresponding to their constitutive elements: source, cue and content. The corpus, which comprises around 20k ARs, was used to investigate the range of structures that can express attribution. The results show a complex and varied relation of which the literature has addressed only a portion. PARC 3.0 is available for research use and can be used in a range of different studies to analyse attribution and validate assumptions as well as to develop supervised attribution extraction models.
Anthology ID: L16-1619
Volume: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month: May
Year: 2016
Address: Portorož, Slovenia
Venue: LREC
Publisher: European Language Resources Association (ELRA)
Pages: 3914–3920
URL: https://aclanthology.org/L16-1619
Cite (ACL): Silvia Pareti. 2016. PARC 3.0: A Corpus of Attribution Relations. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 3914–3920, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal): PARC 3.0: A Corpus of Attribution Relations (Pareti, LREC 2016)
PDF: https://aclanthology.org/L16-1619.pdf
Data: MPQA Opinion Corpus