Dinesh Khandelwal


pdf bib
Zero-shot Entity Linking with Less Data
G P Shrivatsa Bhargav | Dinesh Khandelwal | Saswati Dana | Dinesh Garg | Pavan Kapanipathi | Salim Roukos | Alexander Gray | L Venkata Subramaniam
Findings of the Association for Computational Linguistics: NAACL 2022

Entity Linking (EL) maps an entity mention in a natural language sentence to an entity in a knowledge base (KB). The Zero-shot Entity Linking (ZEL) extends the scope of EL to unseen entities at the test time without requiring new labeled data. BLINK (BERT-based) is one of the SOTA models for ZEL. Interestingly, we discovered that BLINK exhibits diminishing returns, i.e., it reaches 98% of its performance with just 1% of the training data and the remaining 99% of the data yields only a marginal increase of 2% in the performance. While this extra 2% gain makes a huge difference for downstream tasks, training BLINK on large amounts of data is very resource-intensive and impractical. In this paper, we propose a neuro-symbolic, multi-task learning approach to bridge this gap. Our approach boosts the BLINK’s performance with much less data by exploiting an auxiliary information about entity types. Specifically, we train our model on two tasks simultaneously - entity linking (primary task) and hierarchical entity type prediction (auxiliary task). The auxiliary task exploits the hierarchical structure of entity types. Our approach achieves superior performance on ZEL task with significantly less training data. On four different benchmark datasets, we show that our approach achieves significantly higher performance than SOTA models when they are trained with just 0.01%, 0.1%, or 1% of the original training data. Our code is available at https://github.com/IBM/NeSLET.


pdf bib
Leveraging Abstract Meaning Representation for Knowledge Base Question Answering
Pavan Kapanipathi | Ibrahim Abdelaziz | Srinivas Ravishankar | Salim Roukos | Alexander Gray | Ramón Fernandez Astudillo | Maria Chang | Cristina Cornelio | Saswati Dana | Achille Fokoue | Dinesh Garg | Alfio Gliozzo | Sairam Gurajada | Hima Karanam | Naweed Khan | Dinesh Khandelwal | Young-Suk Lee | Yunyao Li | Francois Luus | Ndivhuwo Makondo | Nandana Mihindukulasooriya | Tahira Naseem | Sumit Neelam | Lucian Popa | Revanth Gangi Reddy | Ryan Riegel | Gaetano Rossiello | Udit Sharma | G P Shrivatsa Bhargav | Mo Yu
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
Explanations for CommonsenseQA: New Dataset and Models
Shourya Aggarwal | Divyanshu Mandowara | Vishwajeet Agrawal | Dinesh Khandelwal | Parag Singla | Dinesh Garg
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

CommonsenseQA (CQA) (Talmor et al., 2019) dataset was recently released to advance the research on common-sense question answering (QA) task. Whereas the prior work has mostly focused on proposing QA models for this dataset, our aim is to retrieve as well as generate explanation for a given (question, correct answer choice, incorrect answer choices) tuple from this dataset. Our explanation definition is based on certain desiderata, and translates an explanation into a set of positive and negative common-sense properties (aka facts) which not only explain the correct answer choice but also refute the incorrect ones. We human-annotate a first-of-its-kind dataset (called ECQA) of positive and negative properties, as well as free-flow explanations, for 11K QA pairs taken from the CQA dataset. We propose a latent representation based property retrieval model as well as a GPT-2 based property generation model with a novel two step fine-tuning procedure. We also propose a free-flow explanation generation model. Extensive experiments show that our retrieval model beats BM25 baseline by a relative gain of 100% in F1 score, property generation model achieves a respectable F1 score of 36.4, and free-flow generation model achieves a similarity score of 61.9, where last two scores are based on a human correlated semantic similarity metric.


pdf bib
The TechQA Dataset
Vittorio Castelli | Rishav Chakravarti | Saswati Dana | Anthony Ferritto | Radu Florian | Martin Franz | Dinesh Garg | Dinesh Khandelwal | Scott McCarley | Michael McCawley | Mohamed Nasr | Lin Pan | Cezar Pendus | John Pitrelli | Saurabh Pujar | Salim Roukos | Andrzej Sakrajda | Avi Sil | Rosario Uceda-Sosa | Todd Ward | Rong Zhang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We introduce TECHQA, a domain-adaptation question answering dataset for the technical support domain. The TECHQA corpus highlights two real-world issues from the automated customer support domain. First, it contains actual questions posed by users on a technical forum, rather than questions generated specifically for a competition or a task. Second, it has a real-world size – 600 training, 310 dev, and 490 evaluation question/answer pairs – thus reflecting the cost of creating large labeled datasets with actual data. Hence, TECHQA is meant to stimulate research in domain adaptation rather than as a resource to build QA systems from scratch. TECHQA was obtained by crawling the IBMDeveloper and DeveloperWorks forums for questions with accepted answers provided in an IBM Technote—a technical document that addresses a specific technical issue. We also release a collection of the 801,998 Technotes available on the web as of April 4, 2019 as a companion resource that can be used to learn representations of the IT domain language.