Determining Question-Answer Plausibility in Crowdsourced Datasets Using Multi-Task Learning

Rachel Gardner, Maya Varma, Clare Zhu, Ranjay Krishna


Abstract
Datasets extracted from social networks and online forums are often prone to the pitfalls of natural language, namely the presence of unstructured and noisy data. In this work, we seek to enable the collection of high-quality question-answer datasets from social media by proposing a novel task for automated quality analysis and data cleaning: question-answer (QA) plausibility. Given a machine- or user-generated question and a crowdsourced response from a social media user, we determine whether the question and response are valid; if so, we identify the answer within the free-form response. We design BERT-based models to perform the QA plausibility task, and we evaluate the ability of our models to generate a clean, usable question-answer dataset. Our highest-performing approach consists of a single-task model that determines the plausibility of the question, followed by a multi-task model that evaluates the plausibility of the response and extracts the answer (Question Plausibility AUROC=0.75, Response Plausibility AUROC=0.78, Answer Extraction F1=0.665).
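The two-stage approach described in the abstract can be illustrated with a minimal control-flow sketch. This is not the authors' implementation: the callables below are hypothetical stand-ins for the trained BERT-based models, and the names `clean_pair`, `QAResult`, the toy scorers, and the 0.5 decision threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical interfaces; each callable stands in for a trained model.
# question -> probability that the question is plausible
QuestionScorer = Callable[[str], float]
# (question, response) -> (probability that the response is plausible,
#                          (start, end) character span of the answer)
ResponseModel = Callable[[str, str], Tuple[float, Tuple[int, int]]]


@dataclass
class QAResult:
    question_plausible: bool
    response_plausible: bool
    answer: Optional[str]


def clean_pair(
    question: str,
    response: str,
    score_question: QuestionScorer,
    score_response: ResponseModel,
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> QAResult:
    """Run the single-task question check, then the multi-task response model."""
    # Stage 1: single-task model gates on question plausibility.
    if score_question(question) < threshold:
        # The response is not scored when the question is rejected.
        return QAResult(False, False, None)

    # Stage 2: multi-task model scores the response and proposes an answer span.
    p_response, (start, end) = score_response(question, response)
    if p_response < threshold:
        return QAResult(True, False, None)
    return QAResult(True, True, response[start:end])


# Toy stand-ins so the sketch runs without any trained model.
def toy_question_scorer(question: str) -> float:
    return 1.0 if question.strip().endswith("?") else 0.0


def toy_response_model(question: str, response: str) -> Tuple[float, Tuple[int, int]]:
    if not response.strip():
        return 0.0, (0, 0)
    # Pretend the answer is the last whitespace-separated word.
    start = response.rfind(" ") + 1
    return 1.0, (start, len(response))


result = clean_pair(
    "What food is this?", "I think it's pizza",
    toy_question_scorer, toy_response_model,
)
print(result)
```

Only pairs for which both flags are true would be kept in a cleaned dataset, with `answer` holding the extracted span; in a real system the toy scorers would be replaced by model inference calls.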
Anthology ID:
2020.wnut-1.4
Volume:
Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)
Month:
November
Year:
2020
Address:
Online
Editors:
Wei Xu, Alan Ritter, Tim Baldwin, Afshin Rahimi
Venue:
WNUT
Publisher:
Association for Computational Linguistics
Pages:
22–27
URL:
https://aclanthology.org/2020.wnut-1.4
DOI:
10.18653/v1/2020.wnut-1.4
Cite (ACL):
Rachel Gardner, Maya Varma, Clare Zhu, and Ranjay Krishna. 2020. Determining Question-Answer Plausibility in Crowdsourced Datasets Using Multi-Task Learning. In Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020), pages 22–27, Online. Association for Computational Linguistics.
Cite (Informal):
Determining Question-Answer Plausibility in Crowdsourced Datasets Using Multi-Task Learning (Gardner et al., WNUT 2020)
PDF:
https://aclanthology.org/2020.wnut-1.4.pdf
Code:
rachel-1/qa_plausibility