Learning Latent Semantic Annotations for Grounding Natural Language to Structured Data

Guanghui Qin, Jin-Ge Yao, Xuening Wang, Jinpeng Wang, Chin-Yew Lin


Abstract
Previous work on grounded language learning did not fully capture the semantics underlying the correspondences between structured world state representations and texts, especially those between numerical values and lexical terms. In this paper, we attempt to learn explicit latent semantic annotations from paired structured tables and texts, establishing correspondences between various types of values and texts. We model the joint probability of data fields, texts, phrasal spans, and latent annotations with an adapted semi-hidden Markov model, and impose a soft statistical constraint to further improve performance. As a by-product, we leverage the induced annotations to extract templates for language generation. Experimental results suggest the feasibility of the setting studied here, as well as the effectiveness of our proposed framework.
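The core modeling device named in the abstract is a semi-hidden Markov model, in which each hidden state (a data field) emits a whole phrasal span rather than a single token. As a rough illustration of that idea, here is a minimal sketch of the semi-HMM forward pass; all names, the uniform start distribution, and the emission/transition interfaces are illustrative assumptions, not the authors' actual implementation.

```python
def semi_hmm_forward(tokens, fields, trans, emit, max_len=3):
    """Total probability of `tokens` under a semi-HMM where each
    hidden state (a data field) emits a span of up to `max_len` tokens.

    trans[f1][f2] : transition probability from field f1 to field f2
    emit(f, span) : emission probability of a token span given field f
    """
    n = len(tokens)
    # alpha[i][f]: probability of generating tokens[:i] with the
    # last emitted span labeled by field f
    alpha = [{f: 0.0 for f in fields} for _ in range(n + 1)]
    for f in fields:
        alpha[0][f] = 1.0 / len(fields)  # assumed uniform start prior
    for i in range(1, n + 1):
        for f in fields:
            total = 0.0
            # sum over all span lengths ending at position i
            for l in range(1, min(max_len, i) + 1):
                span = tuple(tokens[i - l:i])
                if i - l == 0:
                    prev = alpha[0][f]  # first span: use start prior
                else:
                    prev = sum(alpha[i - l][g] * trans[g][f]
                               for g in fields)
                total += prev * emit(f, span)
            alpha[i][f] = total
    return sum(alpha[n][f] for f in fields)
```

Training such a model would replace this fixed `emit` with learned span-level emissions (including ones specialized for numerical values) and run EM over the latent segmentations; the forward recursion above is only the inference skeleton.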
Anthology ID:
D18-1411
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
3761–3771
URL:
https://aclanthology.org/D18-1411
DOI:
10.18653/v1/D18-1411
Cite (ACL):
Guanghui Qin, Jin-Ge Yao, Xuening Wang, Jinpeng Wang, and Chin-Yew Lin. 2018. Learning Latent Semantic Annotations for Grounding Natural Language to Structured Data. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3761–3771, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Learning Latent Semantic Annotations for Grounding Natural Language to Structured Data (Qin et al., EMNLP 2018)
PDF:
https://aclanthology.org/D18-1411.pdf
Attachment:
D18-1411.Attachment.pdf
Video:
https://vimeo.com/306117499
Code:
hiaoxui/D2T-Grounding
Data:
RotoWire