RELiC: Retrieving Evidence for Literary Claims

Katherine Thai, Yapei Chang, Kalpesh Krishna, Mohit Iyyer


Abstract
Humanities scholars commonly provide evidence for claims that they make about a work of literature (e.g., a novel) in the form of quotations from the work. We collect a large-scale dataset (RELiC) of 78K literary quotations and surrounding critical analysis and use it to formulate the novel task of literary evidence retrieval, in which models are given an excerpt of literary analysis surrounding a masked quotation and asked to retrieve the quoted passage from the set of all passages in the work. Solving this retrieval task requires a deep understanding of complex literary and linguistic phenomena, which proves challenging to methods that overwhelmingly rely on lexical and semantic similarity matching. We implement a RoBERTa-based dense passage retriever for this task that outperforms existing pretrained information retrieval baselines; however, experiments and analysis by human domain experts indicate that there is substantial room for improvement.
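The retrieval setup described in the abstract can be illustrated with a minimal sketch. It assumes that the analysis context and every candidate passage have already been encoded as vectors, that candidates are scored by dot product, and that success is measured with recall@k; the abstract specifies none of these details. The vectors below are hand-made toy values, not outputs of the RoBERTa-based retriever, and the function names are hypothetical.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_candidates(query_vec: Sequence[float],
                    candidate_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return candidate indices sorted from highest to lowest score."""
    scores = [dot(query_vec, c) for c in candidate_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def recall_at_k(ranking: Sequence[int], gold_index: int, k: int) -> float:
    """1.0 if the gold passage appears in the top k results, else 0.0."""
    return 1.0 if gold_index in ranking[:k] else 0.0


# Toy vectors (assumed for illustration only).
# `query` stands in for the encoded analysis around the masked quotation;
# `candidates` stands in for the encoded passages of the literary work.
query = [0.9, 0.1, 0.3]
candidates = [
    [0.1, 0.8, 0.1],  # passage 0
    [0.8, 0.2, 0.4],  # passage 1 (treated as the gold quotation here)
    [0.3, 0.3, 0.3],  # passage 2
]

ranking = rank_candidates(query, candidates)
print(ranking)
print(recall_at_k(ranking, gold_index=1, k=1))
```

In a real system the candidate set would be every passage in the work, so the ranking step is the same but the list is far longer.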
Anthology ID: 2022.acl-long.517
Volume: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: May
Year: 2022
Address: Dublin, Ireland
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 7500–7518
URL: https://aclanthology.org/2022.acl-long.517
DOI: 10.18653/v1/2022.acl-long.517
Cite (ACL):
Katherine Thai, Yapei Chang, Kalpesh Krishna, and Mohit Iyyer. 2022. RELiC: Retrieving Evidence for Literary Claims. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7500–7518, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
RELiC: Retrieving Evidence for Literary Claims (Thai et al., ACL 2022)
PDF: https://aclanthology.org/2022.acl-long.517.pdf
Code: martiansideofthemoon/relic-retrieval
Data: RELiC, BEIR