%0 Conference Proceedings
%T Higher-order Derivatives of Weighted Finite-state Machines
%A Zmigrod, Ran
%A Vieira, Tim
%A Cotterell, Ryan
%S Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
%D 2021
%8 August
%I Association for Computational Linguistics
%C Online
%F zmigrod-etal-2021-higher
%X Weighted finite-state machines are a fundamental building block of NLP systems. They have withstood the test of time—from their early use in noisy channel models in the 1990s up to modern-day neurally parameterized conditional random fields. This work examines the computation of higher-order derivatives with respect to the normalization constant for weighted finite-state machines. We provide a general algorithm for evaluating derivatives of all orders, which has not been previously described in the literature. In the case of second-order derivatives, our scheme runs in the optimal O(A\² N\⁴) time where A is the alphabet size and N is the number of states. Our algorithm is significantly faster than prior algorithms. Additionally, our approach leads to a significantly faster algorithm for computing second-order expectations, such as covariance matrices and gradients of first-order expectations.
%R 10.18653/v1/2021.acl-short.32
%U https://aclanthology.org/2021.acl-short.32
%U https://doi.org/10.18653/v1/2021.acl-short.32
%P 240-248