2025
CodeGenWrangler: Data Wrangling task automation using Code-Generating Models
Ashlesha Akella | Abhijit Manatkar | Krishnasuri Narayanam | Sameep Mehta
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)
Assuring the data quality of tabular datasets is essential for the efficiency of diverse downstream tasks (such as summarization and fact-checking). Data-wrangling tasks address the challenges of structured data processing and improve the quality of tabular data. Traditional statistical methods handle numeric data efficiently but often fail to capture the semantic context of textual data in tables, while deep learning approaches are resource-intensive, requiring task- and dataset-specific training. Addressing these shortcomings, we present an automated system that leverages LLMs to generate executable code for data-wrangling tasks such as missing value imputation, error detection, and error correction. Our system identifies inherent patterns in the data while leveraging external knowledge, effectively addressing both memory-independent and memory-dependent tasks.
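To make the generate-then-execute pattern the abstract describes concrete, here is a minimal sketch of one such wrangling step. This is not the paper's implementation: `call_llm` is a hypothetical stand-in for any completion client that returns Python source, and the prompt format is invented for illustration.

```python
import pandas as pd

def impute_with_llm(df: pd.DataFrame, column: str, call_llm) -> pd.DataFrame:
    """Ask an LLM to write imputation code for one column, then run it."""
    prompt = (
        f"Write a Python function `fill(df)` that imputes missing values in "
        f"column '{column}' of a pandas DataFrame and returns the DataFrame. "
        f"Column dtype: {df[column].dtype}. Sample rows:\n"
        f"{df.head(5).to_csv(index=False)}"
    )
    code = call_llm(prompt)            # hypothetical client; returns source text
    namespace = {}
    exec(code, {"pd": pd}, namespace)  # in practice, sandbox generated code
    return namespace["fill"](df)
```

The same loop generalizes to error detection and correction by swapping the prompt and the expected function contract.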
Schema and Natural Language Aware In-Context Learning for Improved GraphQL Query Generation
Nitin Gupta | Manish Kesarwani | Sambit Ghosh | Sameep Mehta | Carlos Eberhardt | Dan Debrunner
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)
GraphQL offers a flexible alternative to REST APIs, allowing precise data retrieval across multiple sources in a single query. However, generating complex GraphQL queries remains a significant challenge. Large Language Models (LLMs), while powerful, often produce suboptimal queries due to limited exposure to GraphQL schemas and their structural intricacies. Custom prompt engineering with in-context examples is a common approach to guide LLMs, but existing methods, like randomly selecting examples, often yield unsatisfactory results. While semantic similarity-based selection is effective in other domains, it falls short for GraphQL, where understanding schema-specific nuances is crucial for accurate query formulation. To address this, we propose a Schema and NL-Aware In-context Learning (SNAIL) framework that integrates both structural and semantic information from GraphQL schemas with natural language inputs, enabling schema-aware in-context learning. Unlike existing methods, our approach captures the complexities of GraphQL schemas to improve query generation accuracy. We validate this framework on a publicly available complex GraphQL test dataset, demonstrating notable gains, with some query classes showing up to a 20% performance improvement for certain LLMs. As GraphQL adoption grows, with Gartner predicting that over 60% of enterprises will use it in production by 2027, this work addresses a critical need, paving the way for more efficient and reliable GraphQL query generation in enterprise applications.
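As a rough illustration of what "schema and NL aware" selection could mean in practice, the sketch below scores candidate few-shot examples by combining natural-language embedding similarity with schema-token overlap. The scoring function, the `embed` helper, the pool's `nl`/`schema` fields, and the `alpha` weight are all assumptions for illustration; the paper's actual SNAIL scoring is more involved.

```python
import numpy as np

def schema_features(schema: str) -> set[str]:
    # Crude structural signature: type and field names mentioned in the schema.
    return {tok for tok in schema.replace("{", " ").replace("}", " ").split()
            if tok.isidentifier()}

def select_examples(query, schema, pool, embed, k=3, alpha=0.5):
    """Rank in-context examples by blended semantic + structural similarity."""
    q_vec, q_feats = embed(query), schema_features(schema)

    def score(ex):
        sem = float(np.dot(q_vec, embed(ex["nl"])))  # NL cosine (unit vectors)
        feats = schema_features(ex["schema"])
        struct = len(q_feats & feats) / max(len(q_feats | feats), 1)  # Jaccard
        return alpha * sem + (1 - alpha) * struct

    return sorted(pool, key=score, reverse=True)[:k]
```

The blend is the key design point: purely semantic retrieval would ignore the schema overlap term that the abstract argues is crucial for GraphQL.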
2024
Sequential API Function Calling Using GraphQL Schema
Avirup Saha | Lakshmi Mandal | Balaji Ganesan | Sambit Ghosh | Renuka Sindhgatta | Carlos Eberhardt | Dan Debrunner | Sameep Mehta
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Function calling using Large Language Models (LLMs) is an active research area that aims to empower LLMs to execute APIs and perform real-world tasks. However, sequential function calling, where functions depend on one another's outputs, remains under-explored. To this end, we introduce GraphQLRestBench, a dataset of natural language utterances paired with function call sequences representing real-world REST API calls with variable mapping between functions. To represent the response structure of the functions in the LLM prompt, we use the GraphQL schema of the REST APIs. We also introduce a custom evaluation framework for our dataset consisting of four specially designed metrics. We evaluate various open-source LLMs on our dataset using few-shot Chain-of-Thought and ReAct prompting to establish a reasonable baseline.
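To make "variable mapping between functions" concrete, here is a hypothetical shape for one such example. The field names and the `$cust...` reference syntax are illustrative inventions, not the released dataset format; the point is that a later call consumes a value produced by an earlier one.

```python
# One made-up example: an utterance paired with an interdependent call sequence.
example = {
    "utterance": "Find the customer named Alice and list her open orders.",
    "calls": [
        {"fn": "searchCustomers",
         "args": {"name": "Alice"},
         "out": "cust"},                               # binds the response
        {"fn": "listOrders",
         "args": {"customerId": "$cust.items[0].id",   # variable mapping
                  "status": "OPEN"}},
    ],
}
```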
GraphQL Query Generation: A Large Training and Benchmarking Dataset
Manish Kesarwani | Sambit Ghosh | Nitin Gupta | Shramona Chakraborty | Renuka Sindhgatta | Sameep Mehta | Carlos Eberhardt | Dan Debrunner
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track
GraphQL is a powerful query language for APIs that allows clients to fetch precise data efficiently and flexibly, querying multiple resources with a single request. However, crafting complex GraphQL query operations can be challenging. Large Language Models (LLMs) offer an alternative by generating GraphQL queries from natural language, but they struggle due to limited exposure to publicly available GraphQL schemas, often resulting in invalid or suboptimal queries. Furthermore, no benchmark test suite is available to reliably evaluate the performance of contemporary LLMs. To address this, we present a large-scale, cross-domain Text-to-GraphQL query operation dataset. The dataset includes 10,940 training triples spanning 185 cross-source data stores and 957 test triples over 14 data stores. Each triple consists of a GraphQL schema, a GraphQL query operation, and a corresponding natural language query. The dataset was predominantly created manually, with natural language paraphrasing, and carefully validated, requiring approximately 1,200 person-hours. In our evaluation, we tested 10 state-of-the-art LLMs on our test dataset. The best-performing model achieved an accuracy of only around 50% with one in-context few-shot example, underscoring the necessity of custom fine-tuning. To support further research and benchmarking, we are releasing the training and test datasets under the MIT License. The dataset is available at https://github.com/stepzen-dev/NL2GQL.
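For readers unfamiliar with the task format, a triple of the kind described pairs a schema, a query operation, and a natural-language question. The example below is made up for illustration and is not drawn from the released data.

```python
# A minimal, invented (schema, query, NL) triple in the dataset's spirit.
triple = {
    "schema": """
        type Book { id: ID! title: String! year: Int }
        type Query { booksByYear(year: Int!): [Book] }
    """,
    "query": "query { booksByYear(year: 2020) { title } }",
    "nl": "Which book titles were published in 2020?",
}
```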
2023
CFL: Causally Fair Language Models Through Token-level Attribute Controlled Generation
Rahul Madhavan | Rishabh Garg | Kahini Wadhawan | Sameep Mehta
Findings of the Association for Computational Linguistics: ACL 2023
We propose a method to control the attributes of Language Models (LMs) for text generation using causal Average Treatment Effect (ATE) scores and counterfactual augmentation. We explore this method in the context of LM detoxification and propose the Causally Fair Language (CFL) architecture for detoxifying pre-trained LMs in a plug-and-play manner. Our architecture is based on a Structural Causal Model (SCM) that is mathematically transparent and computationally efficient compared with many existing detoxification techniques. We also propose several new metrics that aim to better understand the behaviour of LMs in the context of toxic text generation. Further, we achieve state-of-the-art performance on toxic degeneration benchmarks computed using Real Toxicity Prompts. Our experiments show that CFL achieves this detoxification without much impact on model perplexity. We also show that CFL mitigates the unintended bias problem through experiments on the BOLD dataset.
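As a rough sketch of the kind of quantity involved (our notation, not necessarily the paper's exact SCM-based definition), a token-level ATE on toxicity can be written as the difference in expected toxicity between texts where the token is intervened to be present and counterfactuals where it is absent:

```latex
% Hedged sketch: one common way to write a token-level ATE on toxicity.
% do(.) denotes an intervention and tox(.) is a toxicity classifier score;
% the paper's counterfactual-augmentation estimator may differ in detail.
\mathrm{ATE}(w) \;=\;
\mathbb{E}\big[\mathrm{tox}(x) \mid do(w \in x)\big]
\;-\;
\mathbb{E}\big[\mathrm{tox}(x) \mid do(w \notin x)\big]
```

Tokens with high ATE scores are the natural targets for attribute-controlled generation during decoding.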
2013
An Empirical Assessment of Contemporary Online Media in Ad-Hoc Corpus Creation for Social Events
Kanika Narang | Seema Nagar | Sameep Mehta | L V Subramaniam | Kuntal Dey
Proceedings of the Sixth International Joint Conference on Natural Language Processing
NLP for uncertain data at scale
Sameep Mehta | L. V. Subramaniam
NAACL HLT 2013 Tutorial Abstracts