Ananya Singha
2024
MetaReflection: Learning Instructions for Language Agents using Past Reflections
Priyanshu Gupta | Shashank Kirtania | Ananya Singha | Sumit Gulwani | Arjun Radhakrishna | Gustavo Soares | Sherry Shi
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
The popularity of Large Language Models (LLMs) has unleashed a new age of Language Agents for solving a diverse range of tasks. While contemporary frontier LLMs are capable enough to power reasonably good Language Agents, the closed-API model makes it hard to improve them in cases where they perform sub-optimally. To address this, recent works have explored techniques to improve their performance using self-reflection and prompt optimization. While techniques like self-reflection work well in an online setup, contemporary prompt optimization techniques are designed to work on simpler tasks. To address this, we introduce METAREFLECTION, a novel offline reinforcement learning technique that enhances the performance of Language Agents by augmenting a semantic memory based on experiential learnings from past trials. We demonstrate the efficacy of METAREFLECTION by evaluating across multiple domains, including complex logical reasoning, biomedical semantic similarity, open world question answering, and vulnerability threat detection in Infrastructure-as-Code, with different agent designs. METAREFLECTION boosts Language Agents’ performance by 4% to 16.82% over the raw GPT-4 baseline and performs on par with existing state-of-the-art prompt optimization techniques while requiring fewer LLM calls.
2023
TSTR: Target Similarity Tuning Meets the Real World
Anirudh Khatry | Sumit Gulwani | Priyanshu Gupta | Vu Le | Mukul Singh | Ananya Singha | Gust Verbruggen
Findings of the Association for Computational Linguistics: EMNLP 2023
Target similarity tuning (TST) is a method of selecting relevant examples in natural language (NL) to code generation through large language models (LLMs) to improve performance. Its goal is to adapt a sentence embedding model to have the similarity between two NL inputs match the similarity between their associated code outputs. In this paper, we propose different methods to apply and improve TST in the real world. First, we replace the sentence transformer with embeddings from a larger model, which reduces sensitivity to the language distribution and thus provides more flexibility in synthetic generation of examples, and we train a tiny model that transforms these embeddings to a space where embedding similarity matches code similarity, which allows the model to remain a black box and only requires a few matrix multiplications at inference time. Second, we show how to efficiently select a smaller number of training examples to train the TST model. Third, we introduce a ranking-based evaluation for TST that does not require end-to-end code generation experiments, which can be expensive to perform.
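The idea of a tiny model on top of frozen embeddings can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the paper's implementation: the transform is a single linear map `W`, the loss is a squared error between dot products of transformed embeddings and a target code-similarity matrix `S`, and the data is synthetic. The function names (`train_transform`, `rank_examples`, `make_toy_data`) are hypothetical.

```python
import numpy as np


def normalize(m):
    """Scale each row to unit L2 norm."""
    return m / np.linalg.norm(m, axis=1, keepdims=True)


def make_toy_data(n=20, d=8, k=4, seed=1):
    """Synthetic stand-ins: X plays the role of frozen NL embeddings,
    S the role of pairwise code similarities (both are assumptions)."""
    rng = np.random.default_rng(seed)
    X = normalize(rng.standard_normal((n, d)))
    Y = normalize(X @ rng.standard_normal((d, k)))
    return X, Y @ Y.T  # S is symmetric by construction


def similarity_loss(W, X, S):
    """Mean squared gap between transformed-embedding dot products and S."""
    Z = X @ W
    return float(np.mean((Z @ Z.T - S) ** 2))


def train_transform(X, S, out_dim=4, lr=0.05, steps=2000, seed=0):
    """Fit a linear map W by plain gradient descent (illustrative choice)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = 0.1 * rng.standard_normal((d, out_dim))
    for _ in range(steps):
        Z = X @ W
        err = Z @ Z.T - S  # symmetric because S is symmetric
        # Gradient of mean((Z Z^T - S)^2) with respect to W.
        grad = X.T @ (4.0 * err @ Z) / (n * n)
        W -= lr * grad
    return W


def rank_examples(query_emb, pool_embs, W):
    """Rank pool examples by cosine similarity in the transformed space.
    Inference needs only two matrix products and a normalization."""
    q = query_emb @ W
    P = pool_embs @ W
    scores = (P @ q) / (np.linalg.norm(P, axis=1) * np.linalg.norm(q) + 1e-12)
    return np.argsort(-scores)
```

In this sketch the embedding model is never touched: only `W` is trained, which mirrors the black-box property described in the abstract. A call such as `rank_examples(X[0], X, train_transform(X, S))` returns pool indices ordered from most to least similar.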
Co-authors
- Sumit Gulwani 2
- Priyanshu Gupta 2
- Anirudh Khatry 1
- Shashank Kirtania 1
- Vu Le 1