Liqun Shao
2023
Logical Transformers: Infusing Logical Structures into Pre-Trained Language Models
Borui Wang
|
Qiuyuan Huang
|
Budhaditya Deb
|
Aaron Halfaker
|
Liqun Shao
|
Daniel McDuff
|
Ahmed Hassan Awadallah
|
Dragomir Radev
|
Jianfeng Gao
Findings of the Association for Computational Linguistics: ACL 2023
Natural language contains rich logical structures and logical information, and correctly detecting and accurately understanding these logical structures and information underlying natural language texts is very crucial for NLP models’ performance on many important NLU and NLG tasks. Existing pre-trained language models based on the transformer architecture mostly adopt a classical design for constructing their input embeddings that ignores the logical structures underlying natural language texts, thus limiting their ability to better capture and encode key logical information in the input sequences. To overcome such limitations, in this paper we first propose a novel approach to construct logic-aware input embeddings for transformer language models through a combination of logic detection, logic mapping and hierarchical logical projections, and then develop a corresponding new modeling paradigm that can upgrade existing transformer language models into logical transformers to boost their performance on different NLU and NLG tasks. Our empirical experiments on four important and challenging NLU and NLG tasks demonstrate that our proposed logical transformer language models can achieve superior performance over their baseline transformer models through a deeper understanding of the logical structures of texts.
2020
Examination and Extension of Strategies for Improving Personalized Language Modeling via Interpolation
Liqun Shao
|
Sahitya Mantravadi
|
Tom Manzini
|
Alejandro Buendia
|
Manon Knoertzer
|
Soundar Srinivasan
|
Chris Quirk
Proceedings of the First Workshop on Natural Language Interfaces
In this paper, we detail novel strategies for interpolating personalized language models and methods to handle out-of-vocabulary (OOV) tokens to improve personalized language models. Using publicly available data from Reddit, we demonstrate improvements in offline metrics at the user level by interpolating a global LSTM-based authoring model with a user-personalized n-gram model. By optimizing this approach with a back-off to uniform OOV penalty and the interpolation coefficient, we observe that over 80% of users receive a lift in perplexity, with an average of 5.4% in perplexity lift per user. In doing this research we extend previous work in building NLIs and improve the robustness of metrics for downstream tasks.
Search
Fix data
Co-authors
- Alejandro Buendia 1
- Budhaditya Deb 1
- Jianfeng Gao 1
- Aaron Halfaker 1
- Ahmed Hassan 1
- show all...