Muhammad Kamran Malik
2022
eRock at Qur’an QA 2022: Contemporary Deep Neural Networks for Qur’an based Reading Comprehension Question Answers
Esha Aftab
|
Muhammad Kamran Malik
Proceedinsg of the 5th Workshop on Open-Source Arabic Corpora and Processing Tools with Shared Tasks on Qur'an QA and Fine-Grained Hate Speech Detection
Question Answering (QA) has enticed the interest of NLP community in recent years. NLP enthusiasts are engineering new Models and fine-tuning the existing ones that can give out answers for the posed questions. The deep neural network models are found to perform exceptionally on QA tasks, but these models are also data intensive. For instance, BERT has outperformed many of its contemporary contenders on SQuAD dataset. In this work, we attempt at solving the closed domain reading comprehension Question Answering task on QRCD (Qur’anic Reading Comprehension Dataset) to extract an answer span from the provided passage, using BERT as a baseline model. We improved the model’s output by applying regularization techniques like weight-decay and data augmentation. Using different strategies we had 0.59% and 0.31% partial Reciprocal Ranking (pRR) on development and testing data splits respectively.
2010
Transliterating Urdu for a Broad-Coverage Urdu/Hindi LFG Grammar
Muhammad Kamran Malik
|
Tafseer Ahmed
|
Sebastian Sulger
|
Tina Bögel
|
Atif Gulzar
|
Ghulam Raza
|
Sarmad Hussain
|
Miriam Butt
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
In this paper, we present a system for transliterating the Arabic-based script of Urdu to a Roman transliteration scheme. The system is integrated into a larger system consisting of a morphology module, implemented via finite state technologies, and a computational LFG grammar of Urdu that was developed with the grammar development platform XLE (Crouch et al. 2008). Our long-term goal is to handle Hindi alongside Urdu; the two languages are very similar with respect to syntax and lexicon and hence, one grammar can be used to cover both languages. However, they are not similar concerning the script -- Hindi is written in Devanagari, while Urdu uses an Arabic-based script. By abstracting away to a common Roman transliteration scheme in the respective transliterators, our system can be enabled to handle both languages in parallel. In this paper, we discuss the pipeline architecture of the Urdu-Roman transliterator, mention several linguistic and orthographic issues and present the integration of the transliterator into the LFG parsing system.
Search
Fix data
Co-authors
- Esha Aftab 1
- Tafseer Ahmed 1
- Miriam Butt 1
- Tina Bögel 1
- Atif Gulzar 1
- show all...