Historical Text Normalization with Delayed Rewards

Simon Flachs, Marcel Bollmann, Anders Søgaard


Abstract
Training neural sequence-to-sequence models with simple token-level log-likelihood is now a standard approach to historical text normalization, albeit often outperformed by phrase-based models. Policy gradient training enables direct optimization for exact matches, and while the small datasets in historical text normalization are prohibitive of from-scratch reinforcement learning, we show that policy gradient fine-tuning leads to significant improvements across the board. Policy gradient training, in particular, leads to more accurate normalizations for long or unseen words.
Anthology ID:
P19-1157
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1614–1619
URL:
https://aclanthology.org/P19-1157/
DOI:
10.18653/v1/P19-1157
Cite (ACL):
Simon Flachs, Marcel Bollmann, and Anders Søgaard. 2019. Historical Text Normalization with Delayed Rewards. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1614–1619, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Historical Text Normalization with Delayed Rewards (Flachs et al., ACL 2019)
PDF:
https://aclanthology.org/P19-1157.pdf