Narayan Acharya
2020
Querying Across Genres for Medical Claims in News
Chaoyuan Zuo | Narayan Acharya | Ritwik Banerjee
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
We present a query-based biomedical information retrieval task across two vastly different genres – newswire and research literature – where the goal is to find the research publication that supports the primary claim made in a health-related news article. For this task, we present a new dataset of 5,034 claims from news paired with research abstracts. Our approach consists of two steps: (i) selecting the most relevant candidates from a collection of 222k research abstracts, and (ii) re-ranking this list. We compare the classical IR approach using BM25 with more recent transformer-based models. Our results show that cross-genre medical IR is a viable task, but incorporating domain-specific knowledge is crucial.
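The two-step approach described in the abstract (candidate selection followed by re-ranking) can be illustrated with a minimal sketch. This is not the authors' implementation: the `BM25` class, the parameter defaults `k1=1.5` and `b=0.75`, the whitespace tokenizer, and the `overlap_scorer` function are all assumptions made for illustration. In particular, `overlap_scorer` is a simple Jaccard-overlap placeholder standing in for the transformer-based re-rankers the abstract mentions.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption, not from the paper).
    return text.lower().split()


class BM25:
    """Small Okapi BM25 scorer over an in-memory list of documents."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.tokens = [tokenize(d) for d in docs]
        self.tfs = [Counter(t) for t in self.tokens]
        self.lengths = [len(t) for t in self.tokens]
        self.avgdl = sum(self.lengths) / len(docs)
        # Document frequency: number of documents containing each term.
        df = Counter(term for tf in self.tfs for term in tf)
        n = len(docs)
        self.idf = {
            t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()
        }

    def score(self, query, i):
        tf, dl = self.tfs[i], self.lengths[i]
        total = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            f = tf[term]
            norm = f + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            total += self.idf[term] * f * (self.k1 + 1) / norm
        return total

    def top_k(self, query, k):
        # Step (i): select the k highest-scoring candidate documents.
        order = sorted(
            range(len(self.tokens)),
            key=lambda i: self.score(query, i),
            reverse=True,
        )
        return order[:k]


def overlap_scorer(query, doc):
    # Placeholder re-ranker: Jaccard overlap of token sets.
    # A transformer-based relevance model would be plugged in here instead.
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q | d) if q | d else 0.0


def rerank(query, candidates, docs, scorer):
    # Step (ii): re-order the candidate list with a second scoring function.
    return sorted(candidates, key=lambda i: scorer(query, docs[i]), reverse=True)


if __name__ == "__main__":
    abstracts = [
        "aspirin reduces risk of heart attack in adults",
        "coffee consumption linked to lower diabetes risk",
        "exercise improves sleep quality",
    ]
    claim = "coffee lowers diabetes risk"
    candidates = BM25(abstracts).top_k(claim, k=2)
    print(rerank(claim, candidates, abstracts, overlap_scorer))
```

Keeping the re-ranker as a plain function argument means the first-stage retriever stays unchanged when the placeholder is swapped for a learned model.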