Alejandro Buendia


2020

pdf bib
Examination and Extension of Strategies for Improving Personalized Language Modeling via Interpolation
Liqun Shao | Sahitya Mantravadi | Tom Manzini | Alejandro Buendia | Manon Knoertzer | Soundar Srinivasan | Chris Quirk
Proceedings of the First Workshop on Natural Language Interfaces

In this paper, we detail novel strategies for interpolating personalized language models and methods to handle out-of-vocabulary (OOV) tokens to improve personalized language models. Using publicly available data from Reddit, we demonstrate improvements in offline metrics at the user level by interpolating a global LSTM-based authoring model with a user-personalized n-gram model. By optimizing this approach with a back-off to uniform OOV penalty and the interpolation coefficient, we observe that over 80% of users receive a lift in perplexity, with an average of 5.4% in perplexity lift per user. In doing this research we extend previous work in building NLIs and improve the robustness of metrics for downstream tasks.