Cassiana Roberta Lizzoni Michelin
2024
Evaluation of Question Answer Generation for Portuguese: Insights and Datasets
Felipe Paula | Cassiana Roberta Lizzoni Michelin | Viviane Moreira
Findings of the Association for Computational Linguistics: EMNLP 2024
Automatic question generation is an increasingly important task that can be applied in different settings, including educational purposes, data augmentation for question-answering (QA), and conversational systems. More specifically, we focus on question answer generation (QAG), which produces question-answer pairs given an input context. We adapt and apply QAG approaches to generate question-answer pairs for different domains and assess their capacity to generate accurate, diverse, and abundant question-answer pairs. Our analyses combine both qualitative and quantitative evaluations that allow insights into the quality and types of errors made by QAG methods. We also look into strategies for error filtering and their effects. Our work concentrates on Portuguese, a widely spoken language that is underrepresented in natural language processing research. To address the pressing need for resources, we generate and make available human-curated extractive QA datasets in three diverse domains.