Leveraging Code to Improve In-Context Learning for Semantic Parsing

Ben Bogin, Shivanshu Gupta, Peter Clark, Ashish Sabharwal


Abstract
In-context learning (ICL) is an appealing approach for semantic parsing due to its few-shot nature and improved generalization. However, learning to parse to rare domain-specific languages (DSLs) from just a few demonstrations is challenging, limiting the performance of even the most capable LLMs. In this work, we show how pre-existing coding abilities of LLMs can be leveraged for semantic parsing by (1) using general-purpose programming languages such as Python instead of DSLs and (2) augmenting prompts with a structured domain description that includes, e.g., the available classes and functions. We show that both these changes significantly improve accuracy across three popular datasets; combined, they lead to dramatic improvements (e.g., 7.9% to 66.5% on SMCalFlow compositional split) and can substantially improve compositional generalization, nearly closing the performance gap between easier i.i.d. and harder compositional splits. Finally, comparisons across multiple PLs and DSL variations suggest that the similarity of a target language to general-purpose code is more important than prevalence in pretraining corpora. Our findings provide an improved methodology for building semantic parsers in the modern context of ICL with LLMs.
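The two changes described in the abstract (Python targets instead of a DSL, plus a structured domain description in the prompt) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Demo` class, the `build_prompt` helper, the prompt layout, and the calendar-style stubs in `DOMAIN_DESCRIPTION` are hypothetical and are not taken from the paper or its datasets.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Demo:
    """One in-context demonstration: an utterance and its target Python program."""
    utterance: str
    program: str


# Hypothetical domain description, written as Python stubs so the model can
# see the available classes and functions. These names are illustrative only.
DOMAIN_DESCRIPTION = '''\
class Person:
    name: str

class Event:
    subject: str
    attendees: list[Person]

def find_person(name: str) -> Person: ...
def create_event(subject: str, attendees: list[Person]) -> Event: ...
'''


def build_prompt(domain: str, demos: List[Demo], query: str) -> str:
    """Concatenate the domain description, the demonstrations, and the new query."""
    parts = ["# Domain description", domain.rstrip(), ""]
    for demo in demos:
        parts += [
            "# Utterance: " + demo.utterance,
            "# Program:",
            demo.program.rstrip(),
            "",
        ]
    # The prompt ends where the model is expected to continue with Python code.
    parts += ["# Utterance: " + query, "# Program:"]
    return "\n".join(parts)


if __name__ == "__main__":
    demos = [
        Demo(
            utterance="Schedule lunch with Ann",
            program='create_event("lunch", [find_person("Ann")])',
        )
    ]
    print(build_prompt(DOMAIN_DESCRIPTION, demos, "Set up a meeting with Bob"))
```

In this sketch the domain description comes first so that every demonstration and the final query can refer to names the model has already seen; the actual prompt format used in the paper may differ.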
Anthology ID:
2024.naacl-long.279
Volume:
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
4971–5012
URL:
https://aclanthology.org/2024.naacl-long.279
Cite (ACL):
Ben Bogin, Shivanshu Gupta, Peter Clark, and Ashish Sabharwal. 2024. Leveraging Code to Improve In-Context Learning for Semantic Parsing. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 4971–5012, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Leveraging Code to Improve In-Context Learning for Semantic Parsing (Bogin et al., NAACL 2024)
PDF:
https://aclanthology.org/2024.naacl-long.279.pdf
Copyright:
 2024.naacl-long.279.copyright.pdf