%0 Conference Proceedings
%T On Importance Sampling-Based Evaluation of Latent Language Models
%A Logan IV, Robert L.
%A Gardner, Matt
%A Singh, Sameer
%Y Jurafsky, Dan
%Y Chai, Joyce
%Y Schluter, Natalie
%Y Tetreault, Joel
%S Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
%D 2020
%8 July
%I Association for Computational Linguistics
%C Online
%F logan-iv-etal-2020-importance
%X Language models that use additional latent structures (e.g., syntax trees, coreference chains, knowledge graph links) provide several advantages over traditional language models. However, likelihood-based evaluation of these models is often intractable as it requires marginalizing over the latent space. Existing works avoid this issue by using importance sampling. Although this approach has asymptotic guarantees, analysis is rarely conducted on the effect of decisions such as sample size and choice of proposal distribution on the reported estimates. In this paper, we carry out this analysis for three models: RNNG, EntityNLM, and KGLM. In addition, we elucidate subtle differences in how importance sampling is applied in these works that can have substantial effects on the final estimates, as well as provide theoretical results which reinforce the validity of this technique.
%R 10.18653/v1/2020.acl-main.196
%U https://aclanthology.org/2020.acl-main.196
%U https://doi.org/10.18653/v1/2020.acl-main.196
%P 2171-2176