Uncovering Code-Mixed Challenges: A Framework for Linguistically Driven Question Generation and Neural Based Question Answering

Deepak Gupta, Pabitra Lenka, Asif Ekbal, Pushpak Bhattacharyya


Abstract
Existing research on question answering (QA) and reading comprehension (RC) is mainly focused on resource-rich languages like English. In recent times, the rapid growth of multi-lingual web content has posed several challenges to existing QA systems. Code-mixing is one such challenge that makes the task more complex. In this paper, we propose a linguistically motivated technique for code-mixed question generation (CMQG) and a neural network based architecture for code-mixed question answering (CMQA). For evaluation, we manually create code-mixed questions for the Hindi-English language pair. To show the effectiveness of our neural network based CMQA technique, we utilize two benchmark datasets, SQuAD and MMQA. Experiments show that our proposed model achieves encouraging performance on CMQG and CMQA.
Anthology ID:
K18-1012
Volume:
Proceedings of the 22nd Conference on Computational Natural Language Learning
Month:
October
Year:
2018
Address:
Brussels, Belgium
Editors:
Anna Korhonen, Ivan Titov
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Pages:
119–130
URL:
https://aclanthology.org/K18-1012
DOI:
10.18653/v1/K18-1012
Cite (ACL):
Deepak Gupta, Pabitra Lenka, Asif Ekbal, and Pushpak Bhattacharyya. 2018. Uncovering Code-Mixed Challenges: A Framework for Linguistically Driven Question Generation and Neural Based Question Answering. In Proceedings of the 22nd Conference on Computational Natural Language Learning, pages 119–130, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Uncovering Code-Mixed Challenges: A Framework for Linguistically Driven Question Generation and Neural Based Question Answering (Gupta et al., CoNLL 2018)
PDF:
https://aclanthology.org/K18-1012.pdf
Data:
SQuAD