FreebaseQA: A New Factoid QA Data Set Matching Trivia-Style Question-Answer Pairs with Freebase

Kelvin Jiang; Dekun Wu; Hui Jiang

doi:10.18653/v1/N19-1028

FreebaseQA: A New Factoid QA Data Set Matching Trivia-Style Question-Answer Pairs with Freebase

Abstract

In this paper, we present a new data set, named FreebaseQA, for open-domain factoid question answering (QA) tasks over structured knowledge bases, like Freebase. The data set is generated by matching trivia-type question-answer pairs with subject-predicate-object triples in Freebase. For each collected question-answer pair, we first tag all entities in each question and search for relevant predicates that bridge a tagged entity with the answer in Freebase. Finally, human annotation is used to remove any false positive in these matched triples. Using this method, we are able to efficiently generate over 54K matches from about 28K unique questions with minimal cost. Our analysis shows that this data set is suitable for model training in factoid QA tasks beyond simpler questions since FreebaseQA provides more linguistically sophisticated questions than other existing data sets.

Anthology ID:: N19-1028
Volume:: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:: June
Year:: 2019
Address:: Minneapolis, Minnesota
Editors:: Jill Burstein, Christy Doran, Thamar Solorio
Venue:: NAACL
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 318–323
Language:
URL:: https://aclanthology.org/N19-1028/
DOI:: 10.18653/v1/N19-1028
Bibkey:
Cite (ACL):: Kelvin Jiang, Dekun Wu, and Hui Jiang. 2019. FreebaseQA: A New Factoid QA Data Set Matching Trivia-Style Question-Answer Pairs with Freebase. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 318–323, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):: FreebaseQA: A New Factoid QA Data Set Matching Trivia-Style Question-Answer Pairs with Freebase (Jiang et al., NAACL 2019)
Copy Citation:
PDF:: https://aclanthology.org/N19-1028.pdf

PDF Cite Search Fix data