Soundar Srinivasan
2020
Examination and Extension of Strategies for Improving Personalized Language Modeling via Interpolation
Liqun Shao | Sahitya Mantravadi | Tom Manzini | Alejandro Buendia | Manon Knoertzer | Soundar Srinivasan | Chris Quirk
Proceedings of the First Workshop on Natural Language Interfaces
In this paper, we detail novel strategies for interpolating personalized language models, along with methods for handling out-of-vocabulary (OOV) tokens, to improve personalized language modeling. Using publicly available data from Reddit, we demonstrate improvements in offline metrics at the user level by interpolating a global LSTM-based authoring model with a user-personalized n-gram model. By optimizing this approach with a back-off to uniform OOV penalty and the interpolation coefficient, we observe that over 80% of users receive a lift in perplexity, with an average perplexity lift of 5.4% per user. In doing this research, we extend previous work on building natural language interfaces (NLIs) and improve the robustness of metrics for downstream tasks.
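The interpolation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple linear mixture of per-token probabilities, and it assumes that "back-off to uniform" means substituting 1/|V| when the personalized model has no estimate for a token. The function names, the `lam` coefficient, and the use of `None` to mark a personal-model OOV token are all hypothetical choices made for illustration.

```python
import math


def interpolate(p_global, p_personal, lam, vocab_size):
    """Linearly mix a global and a personalized token probability.

    Assumption: p_personal is None when the token is out of vocabulary
    for the personalized model; we then back off to a uniform
    probability of 1 / vocab_size.
    """
    if p_personal is None:
        p_personal = 1.0 / vocab_size  # assumed uniform OOV back-off
    return lam * p_personal + (1.0 - lam) * p_global


def perplexity(probs):
    """Perplexity of a sequence from its per-token probabilities."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))


def perplexity_lift(global_probs, personal_probs, lam, vocab_size):
    """Relative perplexity reduction of the mixture over the global model."""
    mixed = [
        interpolate(g, u, lam, vocab_size)
        for g, u in zip(global_probs, personal_probs)
    ]
    base = perplexity(global_probs)
    return (base - perplexity(mixed)) / base
```

Under these assumptions, a positive `perplexity_lift` for a user means the interpolated model assigns that user's text a lower perplexity than the global model alone; sweeping `lam` over a held-out set is one plausible way to choose the interpolation coefficient.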