Exploring Input Representation Granularity for Generating Questions Satisfying Question-Answer Congruence

Madeeswaran Kannan, Haemanth Santhi Ponnusamy, Kordula De Kuthy, Lukas Stein, Detmar Meurers


Abstract
In question generation, the question produced has to be well-formed and meaningfully related to the answer serving as input. Neural generation methods have predominantly leveraged the distributional semantics of words as representations of meaning and generated questions one word at a time. In this paper, we explore the viability of form-based and more fine-grained encodings, such as character or subword representations, for question generation. We start from the typical seq2seq architecture using word embeddings presented by De Kuthy et al. (2020), who generate questions from text so that the answer given in the input text matches not just in meaning but also in form, satisfying question-answer congruence. We show that models trained on character and subword representations substantially outperform the published results based on word embeddings, and they do so with fewer parameters. Our approach eliminates two important problems of the word-based approach: the encoding of rare or out-of-vocabulary words and the incorrect replacement of words with semantically related ones. The character-based model substantially improves on the published results, both in terms of BLEU scores and regarding the quality of the generated questions. Going beyond the specific task, this result adds to the evidence weighing different form- and meaning-based representations for natural language processing tasks.
Anthology ID:
2021.inlg-1.3
Volume:
Proceedings of the 14th International Conference on Natural Language Generation
Month:
August
Year:
2021
Address:
Aberdeen, Scotland, UK
Editors:
Anya Belz, Angela Fan, Ehud Reiter, Yaji Sripada
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
24–34
URL:
https://aclanthology.org/2021.inlg-1.3
DOI:
10.18653/v1/2021.inlg-1.3
Cite (ACL):
Madeeswaran Kannan, Haemanth Santhi Ponnusamy, Kordula De Kuthy, Lukas Stein, and Detmar Meurers. 2021. Exploring Input Representation Granularity for Generating Questions Satisfying Question-Answer Congruence. In Proceedings of the 14th International Conference on Natural Language Generation, pages 24–34, Aberdeen, Scotland, UK. Association for Computational Linguistics.
Cite (Informal):
Exploring Input Representation Granularity for Generating Questions Satisfying Question-Answer Congruence (Kannan et al., INLG 2021)
PDF:
https://aclanthology.org/2021.inlg-1.3.pdf
Data
SQuAD