Vihari Piratla

2025

LLMs on interactive feature collections with implicit dynamic decision strategy
Juyeon Heo | Vihari Piratla | Kyunghyun Lee | Hyonkeun Joh | Adrian Weller
Proceedings of the 31st International Conference on Computational Linguistics

In real-world contexts such as medical diagnosis and business consulting, effective problem-solving often requires gathering relevant information through interactions and targeted questioning to pinpoint the root cause of a problem. However, Large Language Models (LLMs) often struggle to efficiently narrow down the search space, leading to either missing key information or asking redundant questions when guided by implicit methods like Chain-of-Thought (CoT). Some approaches employ external engineered systems to guide reasoning paths, but these methods may not fully utilize the inherent problem-solving capabilities of LLMs and often require multiple expensive API calls. This study explores how we can implicitly guide LLMs to enhance their interactive feature collection abilities within a single prompt. Instead of employing explicit search algorithms or step-by-step external guidance, we provide high-level guidelines that allow LLMs to dynamically adjust their strategies and iteratively refine their decision-making processes independently. Evaluations on synthetic 20-Questions games and real-world scenarios, including business and medical diagnosis cases, demonstrate that LLMs guided by these strategies perform more effective interactive feature collection, asking fewer and more strategic questions and achieving better problem-solving efficiency.

2020

pdf bib abs

NLP Service APIs and Models for Efficient Registration of New Clients
Sahil Shah | Vihari Piratla | Soumen Chakrabarti | Sunita Sarawagi
Findings of the Association for Computational Linguistics: EMNLP 2020

State-of-the-art NLP inference uses enormous neural architectures and models trained for GPU-months, well beyond the reach of most consumers of NLP. This has led to one-size-fits-all public API-based NLP service models by major AI companies, serving millions of clients. They cannot afford traditional fine tuning for individual clients. Many clients cannot even afford significant fine tuning, and own little or no labeled data. Recognizing that word usage and salience diversity across clients leads to reduced accuracy, we initiate a study of practical and lightweight adaptation of centralized NLP services to clients. Each client uses an unsupervised, corpus-based sketch to register to the service. The server modifies its network mildly to accommodate client sketches, and occasionally trains the augmented network over existing clients. When a new client registers with its sketch, it gets immediate accuracy benefits. We demonstrate the proposed architecture using sentiment labeling, NER, and predictive language modeling.

2019

pdf bib abs

Topic Sensitive Attention on Generic Corpora Corrects Sense Bias in Pretrained Embeddings
Vihari Piratla | Sunita Sarawagi | Soumen Chakrabarti
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Given a small corpus D_T pertaining to a limited set of focused topics, our goal is to train embeddings that accurately capture the sense of words in the topic in spite of the limited size of D_T. These embeddings may be used in various tasks involving D_T. A popular strategy in limited data settings is to adapt pretrained embeddings E trained on a large corpus. To correct for sense drift, fine-tuning, regularization, projection, and pivoting have been proposed recently. Among these, regularization informed by a word’s corpus frequency performed well, but we improve upon it using a new regularizer based on the stability of its cooccurrence with other words. However, a thorough comparison across ten topics, spanning three tasks, with standardized settings of hyper-parameters, reveals that even the best embedding adaptation strategies provide small gains beyond well-tuned baselines, which many earlier comparisons ignored. In a bold departure from adapting pretrained embeddings, we propose using D_T to probe, attend to, and borrow fragments from any large, topic-rich source corpus (such as Wikipedia), which need not be the corpus used to pretrain embeddings. This step is made scalable and practical by suitable indexing. We reach the surprising conclusion that even limited corpus augmentation is more useful than adapting embeddings, which suggests that non-dominant sense information may be irrevocably obliterated from pretrained embeddings and cannot be salvaged by adaptation.

pdf bib abs

Parallel Iterative Edit Models for Local Sequence Transduction
Abhijeet Awasthi | Sunita Sarawagi | Rasna Goyal | Sabyasachi Ghosh | Vihari Piratla
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We present a Parallel Iterative Edit (PIE) model for the problem of local sequence transduction arising in tasks like Grammatical error correction (GEC). Recent approaches are based on the popular encoder-decoder (ED) model for sequence to sequence learning. The ED model auto-regressively captures full dependency among output tokens but is slow due to sequential decoding. The PIE model does parallel decoding, giving up the advantage of modeling full dependency in the output, yet it achieves accuracy competitive with the ED model for four reasons: 1. predicting edits instead of tokens, 2. labeling sequences instead of generating sequences, 3. iteratively refining predictions to capture dependencies, and 4. factorizing logits over edits and their token argument to harness pre-trained language models like BERT. Experiments on tasks spanning GEC, OCR correction and spell correction demonstrate that the PIE model is an accurate and significantly faster alternative for local sequence transduction.

Co-authors

Venues

Fix author