Ian Campbell


2024

To interpret and use scientific claims appropriately for sensemaking and decision-making, it is critical to contextualize them, not just with textual evidence that the claim was in fact asserted, but also with key supporting empirical evidence, such as a figure that describes a key result, and with methodological details, such as the methods of data collection. Retrieving this contextual information is difficult and time-consuming for humans when claims are encountered in isolation, away from their source papers. Scholarly document processing models could help to contextualize scientific claims, but datasets designed for this task are lacking. We therefore contribute a dataset of 585 scientific claims with gold annotations of the supporting figures and tables, and gold text snippets of methodological details, that ground the key results behind each claim, and we run the Context24 shared task to encourage model development for this task. This report describes our dataset construction process, summarizes results from the shared task conducted at the 4th Workshop on Scholarly Document Processing (SDP), and discusses future research directions in this space. To support further research, we also publicly release the dataset on HuggingFace.