Distilling weighted finite automata from arbitrary probabilistic models
Ananda Theertha Suresh (author)
Brian Roark (author)
Michael Riley (author)
Vlad Schogol (author)
2019-09
text
Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing
Heiko Vogler (editor)
Andreas Maletti (editor)
Association for Computational Linguistics
Dresden, Germany
conference publication
Weighted finite automata (WFA) are often used to represent probabilistic models, such as n-gram language models, because they are time- and space-efficient for recognition tasks. The probabilistic source to be represented as a WFA, however, may come in many forms. Given a generic probabilistic model over sequences, we propose an algorithm to approximate it as a weighted finite automaton such that the Kullback-Leibler divergence between the source model and the WFA target model is minimized. The proposed algorithm involves a counting step and a difference-of-convex optimization, both of which can be performed efficiently. We demonstrate the usefulness of our approach on several tasks, including distilling n-gram models from neural models.
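The counting step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the simplest possible target, a bigram model without backoff, for which normalizing the expected bigram counts under the source gives the KL-minimizing parameters; the difference-of-convex optimization mentioned in the abstract is not shown. The toy source distribution and all function names (`expected_bigram_counts`, `normalize`, `sequence_prob`, `kl_divergence`) are hypothetical.

```python
import math
from collections import defaultdict

BOS, EOS = "<s>", "</s>"  # assumed boundary symbols, not taken from the paper


def expected_bigram_counts(source):
    """Counting step: expected bigram counts under the source distribution.

    `source` maps each sequence (a string of symbols) to its probability.
    """
    counts = defaultdict(float)
    for seq, prob in source.items():
        symbols = [BOS, *seq, EOS]
        for prev, cur in zip(symbols, symbols[1:]):
            counts[(prev, cur)] += prob
    return counts


def normalize(counts):
    """Turn expected counts into conditional probabilities q(cur | prev)."""
    totals = defaultdict(float)
    for (prev, _), c in counts.items():
        totals[prev] += c
    return {(prev, cur): c / totals[prev] for (prev, cur), c in counts.items()}


def sequence_prob(model, seq):
    """Probability the bigram model assigns to a whole sequence."""
    symbols = [BOS, *seq, EOS]
    p = 1.0
    for prev, cur in zip(symbols, symbols[1:]):
        p *= model.get((prev, cur), 0.0)
    return p


def kl_divergence(source, model):
    """KL(source || model) over the finite support of the toy source."""
    return sum(
        p * math.log(p / sequence_prob(model, seq))
        for seq, p in source.items()
        if p > 0
    )


# Hypothetical toy source: a distribution over three strings.
toy_source = {"ab": 0.5, "aab": 0.3, "b": 0.2}
bigram = normalize(expected_bigram_counts(toy_source))
print(kl_divergence(toy_source, bigram))
```

For a source with infinite support, such as a neural language model, the expected counts cannot be enumerated this way and would have to be estimated, for example by sampling.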
suresh-etal-2019-distilling
10.18653/v1/W19-3112
https://aclanthology.org/W19-3112
2019-09
87
97