Christoph Miksovic


2024

pdf bib
Adapting LLMs for Structured Natural Language API Integration
Robin Chan | Katsiaryna Mirylenka | Thomas Gschwind | Christoph Miksovic | Paolo Scotton | Enrico Toniato | Abdel Labbi
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track

API integration is crucial for enterprise systems, as it enables seamless interaction between applications within workflows. However, the diversity and complexity of the API landscape present significant challenges in combining API calls based on user intent.Existing methods rely on named entity recognition (NER) and knowledge graphs, but struggle to generate more complex control flow structures, such as conditionals and loops.We propose a novel framework that leverages the success of large language models (LLMs) in code generation to integrate APIs based on natural language input. Our approach involves fine-tuning an LLM using automatically generated API flows derived from OpenAPI specifications.We further evaluate the effectiveness of enforcing the syntax and schema adherence through constrained decoding.To enable systematic comparison, we introduce targeted test suites to assess the generalization capabilities of these approaches and their ability to retain structured knowledge.Our findings show that LLMs fine-tuned on OpenAPI specifications can (a) learn structural API constraints implicitly during training, and (b) achieve significant improvements in both in-distribution and out-of-distribution performance over NER and retrieval-augmented generation (RAG)-based approaches.