Efficient Constituency Tree based Encoding for Natural Language to Bash Translation

Shikhar Bharadwaj, Shirish Shevade


Abstract
Bash is a Unix command language used for interacting with the Operating System. Recent works on natural language to Bash translation have made significant advances, but none of the previous methods utilize the problem’s inherent structure. We identify this structure and propose a Segmented Invocation Transformer (SIT) that utilizes the information from the constituency parse tree of the natural language text. Our method is motivated by the alignment between segments in the natural language text and Bash command components. Incorporating the structure in the modelling improves the performance of the model. Since such systems must be universally accessible, we benchmark the inference times on a CPU rather than a GPU. We observe a 1.8x improvement in the inference time and a 5x reduction in model parameters. Attribution analysis using Integrated Gradients reveals that the proposed method can capture the problem structure.
Anthology ID:
2022.naacl-main.230
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
July
Year:
2022
Address:
Seattle, United States
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
3159–3168
URL:
https://aclanthology.org/2022.naacl-main.230
DOI:
10.18653/v1/2022.naacl-main.230
Cite (ACL):
Shikhar Bharadwaj and Shirish Shevade. 2022. Efficient Constituency Tree based Encoding for Natural Language to Bash Translation. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3159–3168, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
Efficient Constituency Tree based Encoding for Natural Language to Bash Translation (Bharadwaj & Shevade, NAACL 2022)
PDF:
https://aclanthology.org/2022.naacl-main.230.pdf
Code
shikhar-s/segmented-invocation-transformer
Data
NLC2CMD