A Comprehensive Resource to Evaluate Complex Open Domain Question Answering

Silvia Quarteroni, Alessandro Moschitti


Abstract
Complex Question Answering is a discipline that involves a deep understanding of question/answer relations, such as those characterizing definition and procedural questions and their answers. To contribute to the improvement of this technology, we deliver two question and answer corpora for complex questions, WEB-QA and TREC-QA, extracted by the same Question Answering system, YourQA, from the Web and from the AQUAINT data collection respectively. We believe that such corpora can be useful resources to address a type of QA that is far from being efficiently solved. WEB-QA and TREC-QA are available in two formats: judgment files and training/testing files. Judgment files contain a ranked list of candidate answers to TREC-10 complex questions, extracted using YourQA as a baseline system and manually labelled according to a Likert scale from 1 (completely incorrect) to 5 (totally correct). Training and testing files contain learning instances compatible with SVM-light; these are useful for experimenting with shallow and complex structural features such as parse trees and semantic role labels. Our experiments with the above corpora have allowed us to show that structured information representation is useful for improving the accuracy of complex QA systems and for re-ranking answers.
Anthology ID:
L10-1126
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/188_Paper.pdf
Cite (ACL):
Silvia Quarteroni and Alessandro Moschitti. 2010. A Comprehensive Resource to Evaluate Complex Open Domain Question Answering. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
A Comprehensive Resource to Evaluate Complex Open Domain Question Answering (Quarteroni & Moschitti, LREC 2010)
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/188_Paper.pdf