Few-Shot Semantic Parsing with Language Models Trained on Code

Richard Shin; Benjamin Van Durme

doi:10.18653/v1/2022.naacl-main.396

Abstract

Large language models can perform semantic parsing with little training data when prompted with in-context examples. It has been shown that performance improves when the problem is formulated as paraphrasing into canonical utterances, which casts the underlying meaning representation into a controlled, natural-language-like representation. Intuitively, such models can more easily output canonical utterances because they are closer to the natural language used for pre-training. Recently, models also pre-trained on code, like OpenAI Codex, have risen in prominence. For semantic parsing tasks where we map natural language into code, such models may prove more adept. In this paper, we test this hypothesis and find that Codex performs better on such tasks than equivalent GPT-3 models. We evaluate on Overnight and SMCalFlow and find that, unlike GPT-3, Codex performs similarly when targeting meaning representations directly, perhaps because meaning representations are structured similarly to code in these datasets.
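
To make the two prompting targets concrete, the following is a minimal sketch of how a few-shot prompt might be assembled from in-context examples. It is not the authors' implementation: the `build_prompt` helper, the `Input:`/`Output:` layout, and the example utterances, canonical form, and meaning representation are all invented for illustration and are not taken from Overnight or SMCalFlow.

```python
from typing import List, Tuple


def build_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Join (utterance, target) pairs and leave the final target blank.

    The "Input:"/"Output:" layout is an assumption made for this sketch.
    """
    blocks = [
        f"Input: {utterance}\nOutput: {target}"
        for utterance, target in examples
    ]
    # The model is expected to continue the text after the last "Output:".
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


# Invented examples (hypothetical, not drawn from either dataset).
# Target 1: a canonical utterance in controlled natural language.
CANONICAL_EXAMPLES = [
    ("who is taller than alice",
     "person whose height is larger than height of alice"),
]
# Target 2: a code-like meaning representation in a made-up notation.
MEANING_EXAMPLES = [
    ("who is taller than alice",
     "(filter person (> height (height alice)))"),
]

canonical_prompt = build_prompt(CANONICAL_EXAMPLES, "who is shorter than bob")
meaning_prompt = build_prompt(MEANING_EXAMPLES, "who is shorter than bob")
print(canonical_prompt)
```

Only the target side of each example changes between the two prompts, which is the comparison the abstract describes: the same model is asked to produce either canonical utterances or meaning representations directly.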

Anthology ID: 2022.naacl-main.396
Volume: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month: July
Year: 2022
Address: Seattle, United States
Editors: Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 5417–5425
URL: https://aclanthology.org/2022.naacl-main.396/
DOI: 10.18653/v1/2022.naacl-main.396
Cite (ACL): Richard Shin and Benjamin Van Durme. 2022. Few-Shot Semantic Parsing with Language Models Trained on Code. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5417–5425, Seattle, United States. Association for Computational Linguistics.
Cite (Informal): Few-Shot Semantic Parsing with Language Models Trained on Code (Shin & Van Durme, NAACL 2022)
PDF: https://aclanthology.org/2022.naacl-main.396.pdf
Video: https://aclanthology.org/2022.naacl-main.396.mp4
