Varsha Embar


pdf bib
Evaluating Pretrained Transformer Models for Entity Linking inTask-Oriented Dialog
Sai Muralidhar Jayanthi | Varsha Embar | Karthik Raghunathan
Proceedings of the 18th International Conference on Natural Language Processing (ICON)

The wide applicability of pretrained transformer models (PTMs) for natural language tasks is well demonstrated, but their ability to comprehend short phrases of text is less explored. To this end, we evaluate different PTMs from the lens of unsupervised Entity Linking in task-oriented dialog across 5 characteristics– syntactic, semantic, short-forms, numeric and phonetic. Our results demonstrate that several of the PTMs produce sub-par results when compared to traditional techniques, albeit competitive to other neural baselines. We find that some of their shortcomings can be addressed by using PTMs fine-tuned for text-similarity tasks, which illustrate an improved ability in comprehending semantic and syntactic correspondences, as well as some improvements for short-forms, numeric and phonetic variations in entity mentions. We perform qualitative analysis to understand nuances in their predictions and discuss scope for further improvements.


pdf bib
Entity resolution for noisy ASR transcripts
Arushi Raghuvanshi | Vijay Ramakrishnan | Varsha Embar | Lucien Carroll | Karthik Raghunathan
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations

Large vocabulary domain-agnostic Automatic Speech Recognition (ASR) systems often mistranscribe domain-specific words and phrases. Since these generic ASR systems are the first component of most voice assistants in production, building Natural Language Understanding (NLU) systems that are robust to these errors can be a challenging task. In this paper, we focus on handling ASR errors in named entities, specifically person names, for a voice-based collaboration assistant. We demonstrate an effective method for resolving person names that are mistranscribed by black-box ASR systems, using character and phoneme-based information retrieval techniques and contextual information, which improves accuracy by 40.8% on our production system. We provide a live interactive demo to further illustrate the nuances of this problem and the effectiveness of our solution.


pdf bib
Towards Inference-Oriented Reading Comprehension: ParallelQA
Soumya Wadhwa | Varsha Embar | Matthias Grabmair | Eric Nyberg
Proceedings of the Workshop on Generalization in the Age of Deep Learning

In this paper, we investigate the tendency of end-to-end neural Machine Reading Comprehension (MRC) models to match shallow patterns rather than perform inference-oriented reasoning on RC benchmarks. We aim to test the ability of these systems to answer questions which focus on referential inference. We propose ParallelQA, a strategy to formulate such questions using parallel passages. We also demonstrate that existing neural models fail to generalize well to this setting.