Learning Semantic Correspondences in Technical Documentation

Kyle Richardson, Jonas Kuhn


Abstract
We consider the problem of translating high-level textual descriptions to formal representations in technical documentation as part of an effort to model the meaning of such documentation. We focus specifically on the problem of learning translational correspondences between text descriptions and grounded representations in the target documentation, such as formal representations of functions or code templates. Our approach exploits the parallel nature of such documentation, that is, the tight coupling between high-level text and the low-level representations we aim to learn. Data is collected by mining technical documents for such parallel text-representation pairs, which we use to train a simple semantic parsing model. We report new baseline results on sixteen novel datasets, including the standard library documentation for nine popular programming languages across seven natural languages, and a small collection of Unix utility manuals.
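To make the idea of mining parallel text-representation pairs concrete, the following is a minimal sketch, not the authors' pipeline. It assumes the documentation takes the form of Python docstrings, and it pairs the first line of each docstring with a simplified function signature. The names `SOURCE`, `mine_pairs`, and `PAIRS`, and the sample functions, are hypothetical and chosen only for illustration.

```python
import ast

# Hypothetical documented source, standing in for a real library's documentation.
SOURCE = '''
def max_of(a, b):
    """Return the larger of two values."""
    return a if a > b else b

def helper(x):
    return x

def join_path(base, name):
    """Join a base directory and a file name.

    Longer discussion that is ignored here.
    """
    return base + "/" + name
'''


def mine_pairs(source):
    """Collect (description, signature) pairs from documented functions."""
    pairs = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.FunctionDef):
            continue
        doc = ast.get_docstring(node)
        if not doc:
            continue  # undocumented functions have no text side to pair
        description = doc.strip().splitlines()[0]  # keep the summary line only
        params = ", ".join(arg.arg for arg in node.args.args)
        pairs.append((description, f"{node.name}({params})"))
    return pairs


PAIRS = mine_pairs(SOURCE)
for description, signature in PAIRS:
    print(f"{description}  =>  {signature}")
```

Each resulting pair has a natural-language side and a formal side, which is the kind of parallel data a translation-style semantic parsing model could be trained on. The signature format used here (name plus positional parameter names) is a simplification assumed for this sketch.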
Anthology ID:
P17-1148
Volume:
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2017
Address:
Vancouver, Canada
Editors:
Regina Barzilay, Min-Yen Kan
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1612–1622
URL:
https://aclanthology.org/P17-1148
DOI:
10.18653/v1/P17-1148
Cite (ACL):
Kyle Richardson and Jonas Kuhn. 2017. Learning Semantic Correspondences in Technical Documentation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1612–1622, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
Learning Semantic Correspondences in Technical Documentation (Richardson & Kuhn, ACL 2017)
PDF:
https://aclanthology.org/P17-1148.pdf
Note:
 P17-1148.Notes.pdf