Fact or Fiction: Verifying Scientific Claims
David Wadden | Shanchuan Lin | Kyle Lo | Lucy Lu Wang | Madeleine van Zuylen | Arman Cohan | Hannaneh Hajishirzi
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
We introduce scientific claim verification, a new task to select abstracts from the research literature containing evidence that SUPPORTS or REFUTES a given scientific claim, and to identify rationales justifying each decision. To study this task, we construct SciFact, a dataset of 1.4K expert-written scientific claims paired with evidence-containing abstracts annotated with labels and rationales. We develop baseline models for SciFact, and demonstrate that simple domain adaptation techniques substantially improve performance compared to models trained on Wikipedia or political news. We show that our system is able to verify claims related to COVID-19 by identifying evidence from the CORD-19 corpus. Our experiments indicate that SciFact will provide a challenging testbed for the development of new systems designed to retrieve and reason over corpora containing specialized domain knowledge. Data and code for this new task are publicly available at https://github.com/allenai/scifact. A leaderboard and COVID-19 fact-checking demo are available at https://scifact.apps.allenai.org.
MathQA: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms
Aida Amini | Saadia Gabriel | Shanchuan Lin | Rik Koncel-Kedziorski | Yejin Choi | Hannaneh Hajishirzi
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
We introduce a large-scale dataset of math word problems and an interpretable neural math problem solver that learns to map problems to their operation programs. Due to annotation challenges, current datasets in this domain have either been relatively small in scale or have not offered precise operational annotations over diverse problem types. We introduce a new representation language to model the operation programs corresponding to each math problem, with the aim of improving both the performance and the interpretability of the learned models. Using this representation language, we significantly enhance the AQUA-RAT dataset with fully-specified operational programs. We additionally introduce a neural sequence-to-program model with automatic problem categorization. Our experiments show improvements over competitive baselines on our dataset as well as on the AQUA-RAT dataset. The results are still lower than human performance, indicating that the dataset poses new challenges for future research. Our dataset is available at https://math-qa.github.io/math-QA/