Ashwin Srinivasan

2025

Concept Distillation from Strong to Weak Models via Hypotheses-to-Theories Prompting
Emmanuel Aboah Boateng | Cassiano O Becker | Nabiha Asghar | Kabir Walia | Ashwin Srinivasan | Ehi Nosakhare | Soundararajan Srinivasan | Victor Dibia
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)

Hand-crafting high-quality prompts to optimize the performance of language models is a complicated and labor-intensive process. Furthermore, when migrating to newer, smaller, or weaker models (possibly for latency or cost gains), prompts need to be updated to re-optimize task performance. We propose Concept Distillation (CD), an automatic prompt optimization technique for enhancing weaker models on complex tasks. CD involves: (1) collecting mistakes made by weak models with a base prompt (initialization), (2) using a strong model to generate reasons for these mistakes and create rules/concepts for weak models (induction), and (3) filtering these rules based on validation set performance and integrating them into the base prompt (deduction/verification). We evaluated CD on NL2Code and mathematical reasoning tasks, observing significant performance boosts for small and weaker language models. Notably, Mistral-7B’s accuracy on Multi-Arith increased by 20%, and Phi-3-mini-3.8B’s accuracy on HumanEval rose by 34%. Compared to other automated methods, CD offers an effective, cost-efficient strategy for improving weak models’ performance on complex tasks and enables seamless workload migration across different language models without compromising performance.
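
The three CD stages described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the greedy one-rule-at-a-time filter, and the toy stand-ins for the weak and strong models are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (input, expected output)


def accuracy(model: Callable[[str, str], str], prompt: str, data: List[Example]) -> float:
    """Fraction of examples the model answers correctly under the given prompt."""
    return sum(model(prompt, x) == y for x, y in data) / len(data)


def concept_distillation(weak_model, strong_model, base_prompt, train, val):
    # (1) Initialization: collect the weak model's mistakes under the base prompt.
    mistakes = []
    for x, y in train:
        pred = weak_model(base_prompt, x)
        if pred != y:
            mistakes.append((x, y, pred))
    # (2) Induction: the strong model proposes candidate rules from the mistakes.
    candidate_rules = strong_model(mistakes)
    # (3) Deduction/verification: keep a rule only if it improves validation
    # accuracy (a greedy filter, assumed here for simplicity).
    prompt, best = base_prompt, accuracy(weak_model, base_prompt, val)
    for rule in candidate_rules:
        trial = prompt + "\nRule: " + rule
        score = accuracy(weak_model, trial, val)
        if score > best:
            prompt, best = trial, score
    return prompt


# Hypothetical toy stand-ins for real language models.
def toy_weak(prompt: str, x: str) -> str:
    a, b = map(int, x.split("-"))
    # The toy weak model adds unless the prompt tells it to subtract.
    return str(a - b) if "subtract" in prompt else str(a + b)


def toy_strong(mistakes) -> List[str]:
    return ["Always answer with a single number.", "The '-' sign means subtract."]


train = [("5-3", "2"), ("9-4", "5")]
val = [("7-2", "5"), ("8-8", "0")]
final_prompt = concept_distillation(toy_weak, toy_strong, "Solve the problem.", train, val)
```

In this toy run only the second candidate rule raises validation accuracy, so it alone is appended to the base prompt.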

2024

AutoRef: Generating Refinements of Reviews Given Guidelines
Soham Chitnis | Manasi Patwardhan | Ashwin Srinivasan | Tanmay Tulsidas Verlekar | Lovekesh Vig | Gautam Shroff
Proceedings of the Fourth Workshop on Scholarly Document Processing (SDP 2024)

When examining reviews of research papers, we can distinguish between two hypothetical referees: the maximally lenient referee who accepts any paper with a vacuous review and the maximally strict one who rejects any paper with an overly pedantic review. Clearly, both are of no practical value. Our interest is in a referee who makes a balanced judgement and provides a review abiding by the guidelines. In this paper, we present a case study of automatic correction of an existing machine-generated or human review. The AutoRef system implements an iterative approach that progressively “refines” a review by attempting to make it more compliant with pre-defined requirements of a “good” review. It implements the following steps: (1) Translate the review requirements into a natural-language specification consisting of “yes/no” questions; (2) Given a (paper, review) pair, extract answers to the questions; (3) Use the results in (2) to generate a new review; and (4) Return to Step (2) with the paper and the new review. Here, (2) and (3) are implemented by large language model (LLM) based agents. We present a case study using papers and reviews made available for the International Conference on Learning Representations (ICLR). Our initial empirical results suggest that AutoRef progressively improves the compliance of the generated reviews with the specification. The currently designed specification makes AutoRef generate progressively stricter reviews, with decisions more inclined towards “rejections”. This demonstrates the applicability of AutoRef for: (1) The progressive correction of overly lenient reviews, which is useful for referees and meta-reviewers; and (2) The generation of progressively stricter reviews for a paper, starting from a vacuous review (“Great paper. Accept.”), which helps authors assess weaknesses in their papers.
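
The refinement loop in steps (2) to (4) can be sketched as below. This is a hedged illustration rather than the AutoRef system itself: the function names, the stopping rule (stop when every question is answered "yes" or after `max_rounds`), and the keyword-based toy agents standing in for the LLM agents are assumptions.

```python
from typing import Callable, Dict, List


def autoref_loop(
    paper: str,
    review: str,
    questions: List[str],
    answer_agent: Callable[[str, str, List[str]], Dict[str, bool]],
    refine_agent: Callable[[str, str, Dict[str, bool]], str],
    max_rounds: int = 5,
) -> List[str]:
    """Return the sequence of reviews produced by iterative refinement."""
    history = [review]
    for _ in range(max_rounds):
        # Step (2): answer the yes/no questions for the current (paper, review) pair.
        answers = answer_agent(paper, review, questions)
        if all(answers.values()):
            break  # assumed stopping rule: the review meets every requirement
        # Step (3): generate a new review from the answers; step (4): loop again.
        review = refine_agent(paper, review, answers)
        history.append(review)
    return history


# Hypothetical toy agents: each question is "satisfied" if a keyword appears.
KEYWORDS = {
    "Does the review discuss weaknesses?": "weaknesses",
    "Does the review discuss novelty?": "novelty",
}


def toy_answer(paper: str, review: str, questions: List[str]) -> Dict[str, bool]:
    return {q: KEYWORDS[q] in review.lower() for q in questions}


def toy_refine(paper: str, review: str, answers: Dict[str, bool]) -> str:
    # Address the first unmet requirement by appending a placeholder comment.
    unmet = next(q for q, ok in answers.items() if not ok)
    return review + f" [Comment on {KEYWORDS[unmet]}.]"


reviews = autoref_loop("<paper text>", "Great paper. Accept.", list(KEYWORDS), toy_answer, toy_refine)
```

Starting from the vacuous review, each round here addresses one unmet question, so the history grows until the specification is fully satisfied.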

2022

Dense retrieval (DR) methods conduct text retrieval by first encoding texts in the embedding space and then matching them by nearest neighbor search. This requires strong locality properties from the representation space, e.g., close allocations of each small group of relevant texts, which are hard to generalize to domains without sufficient training data. In this paper, we aim to improve the generalization ability of DR models from source training domains with rich supervision signals to target domains without any relevance label, in the zero-shot setting. To achieve that, we propose Momentum adversarial Domain Invariant Representation learning (MoDIR), which introduces a momentum method to train a domain classifier that distinguishes source versus target domains, and then adversarially updates the DR encoder to learn domain invariant representations. Our experiments show that MoDIR robustly outperforms its baselines on 10+ ranking datasets collected in the BEIR benchmark in the zero-shot setup, with more than 10% relative gains on datasets with enough sensitivity for DR models’ evaluation. Source code is available at https://github.com/ji-xin/modir.

2019

Dr.Quad at MEDIQA 2019: Towards Textual Inference and Question Entailment using contextualized representations
Vinayshekhar Bannihatti Kumar | Ashwin Srinivasan | Aditi Chaudhary | James Route | Teruko Mitamura | Eric Nyberg
Proceedings of the 18th BioNLP Workshop and Shared Task

This paper presents the submissions by Team Dr.Quad to the ACL-BioNLP 2019 shared task on Textual Inference and Question Entailment in the Medical Domain. Our system is based on the prior work of Liu et al. (2019), which uses a multi-task objective function for textual entailment. In this work, we explore different strategies for generalizing state-of-the-art language understanding models to the specialized medical domain. Our results on the shared task demonstrate that incorporating domain knowledge through data augmentation is a powerful strategy for addressing challenges posed by specialized domains such as medicine.

2007

USP-IBM-1 and USP-IBM-2: The ILP-based Systems for Lexical Sample WSD in SemEval-2007
Lucia Specia | Maria das Graças Volpe Nunes | Ashwin Srinivasan | Ganesh Ramakrishnan
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)
