Dawn Sepehr


2021

Active Curriculum Learning
Borna Jafarpour | Dawn Sepehr | Nick Pogrebnyakov
Proceedings of the First Workshop on Interactive Learning for Natural Language Processing

This paper investigates the relationship between two closely related machine learning disciplines, Active Learning (AL) and Curriculum Learning (CL), through the lens of several novel curricula. It also introduces Active Curriculum Learning (ACL), which improves AL by combining it with CL to benefit from both the dynamic nature of the AL informativeness concept and the human insights used in the design of the curriculum heuristics. A comparison of ACL and AL on two public datasets for the Named Entity Recognition (NER) task shows the effectiveness of combining AL and CL using our proposed framework.

2020

A Smart System to Generate and Validate Question Answer Pairs for COVID-19 Literature
Rohan Bhambhoria | Luna Feng | Dawn Sepehr | John Chen | Conner Cowling | Sedef Kocak | Elham Dolatabadi
Proceedings of the First Workshop on Scholarly Document Processing

Automatically generating question-answer (QA) pairs from the rapidly growing coronavirus-related literature is of great value to the medical community. Creating high-quality QA pairs would allow researchers to build models that address scientific queries whose answers are not readily available, in support of the ongoing fight against the pandemic. QA pair generation is, however, a very tedious and time-consuming task requiring domain expertise for annotation and evaluation. In this paper we present our contribution to addressing some of the challenges of building a QA system without gold data. We first present a method to create QA pairs from a large semi-structured dataset using transformer and rule-based models. Next, we propose a means of engaging subject matter experts (SMEs) to annotate the QA pairs through a web application. Finally, we present experiments showing the effectiveness of active learning in designing a high-performing model with substantially lower annotation effort from the domain experts.