Treepiece: Faster Semantic Parsing via Tree Tokenization

Sid Wang, Akshat Shrivastava, Aleksandr Livshits


Abstract
Autoregressive (AR) encoder-decoder neural networks have proved successful in many NLP problems, including Semantic Parsing, a task that translates natural language into machine-readable parse trees. However, the sequential prediction process of AR models can be slow. To accelerate AR for semantic parsing, we introduce a new technique called TreePiece that tokenizes a parse tree into subtrees and generates one subtree per decoding step. On the TOPv2 benchmark, TreePiece decodes 4.6 times faster than standard AR, and runs at speed comparable to Non-Autoregressive (NAR) models while achieving significantly higher accuracy.
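
The core idea in the abstract, splitting the target parse into subtree units so that each decoding step emits several symbols at once, can be illustrated with a toy sketch. Everything below is a hypothetical simplification for illustration: the greedy longest-match splitter tokenize_parse, the hand-written subtree vocabulary, and the example TOPv2-style parse are assumptions, not the paper's actual algorithm or code.

from typing import List, Set, Tuple

def tokenize_parse(symbols: List[str], vocab: Set[Tuple[str, ...]]) -> List[Tuple[str, ...]]:
    """Split a linearized parse into subtree units by greedy longest match,
    so an AR decoder would emit one unit (possibly several symbols) per step."""
    pieces, i = [], 0
    while i < len(symbols):
        # Try the longest span starting at i first; fall back to a single symbol.
        for j in range(len(symbols), i, -1):
            piece = tuple(symbols[i:j])
            if piece in vocab or j == i + 1:
                pieces.append(piece)
                i = j
                break
    return pieces

# Toy linearized TOPv2-style parse (intent/slot labels are illustrative).
parse = "[IN:CREATE_REMINDER [SL:TODO call mom ] [SL:DATE_TIME tomorrow ] ]".split()

# Hand-written stand-in for a learned subtree vocabulary.
vocab = {
    ("[IN:CREATE_REMINDER", "[SL:TODO"),
    ("]", "[SL:DATE_TIME"),
    ("]", "]"),
}

pieces = tokenize_parse(parse, vocab)
print(len(parse), "symbol-level decoding steps vs", len(pieces), "subtree-level steps")

Running the sketch prints 9 symbol-level steps versus 6 subtree-level steps. In the paper's setting the subtree vocabulary is presumably learned from the training parses; the sketch only shows why coarser units mean fewer autoregressive decoding steps.
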
Anthology ID: 2023.findings-emnlp.740
Volume: Findings of the Association for Computational Linguistics: EMNLP 2023
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 11082–11092
URL: https://aclanthology.org/2023.findings-emnlp.740
DOI: 10.18653/v1/2023.findings-emnlp.740
Cite (ACL): Sid Wang, Akshat Shrivastava, and Aleksandr Livshits. 2023. Treepiece: Faster Semantic Parsing via Tree Tokenization. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 11082–11092, Singapore. Association for Computational Linguistics.
Cite (Informal): Treepiece: Faster Semantic Parsing via Tree Tokenization (Wang et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-emnlp.740.pdf