Shengzhe Li


2023

pdf bib
Bridging the Gap between Subword and Character Segmentation in Pretrained Language Models
Shun Kiyono | Sho Takase | Shengzhe Li | Toshinori Sato
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

Pretrained language models require the use of consistent segmentation (e.g., subword- or character-level segmentation) in pretraining and finetuning. In NLP, many tasks are modeled by subword-level segmentation better than by character-level segmentation. However, because of their format, several tasks require the use of character-level segmentation. Thus, in order to tackle both types of NLP tasks, language models must be independently pretrained for both subword and character-level segmentation. However, this is an inefficient and costly procedure. Instead, this paper proposes a method for training a language model with unified segmentation. This means that the trained model can be finetuned on both subword- and character-level segmentation. The principle of the method is to apply the subword regularization technique to generate a mixture of subword- and character-level segmentation. Through experiment on BERT models, we demonstrate that our method can halve the computational cost of pretraining.

2022

pdf bib
Building a Personalized Dialogue System with Prompt-Tuning
Tomohito Kasahara | Daisuke Kawahara | Nguyen Tung | Shengzhe Li | Kenta Shinzato | Toshinori Sato
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop

Dialogue systems without consistent responses are not attractive. In this study, we build a dialogue system that can respond based on a given character setting (persona) to bring consistency. Considering the trend of the rapidly increasing scale of language models, we propose an approach that uses prompt-tuning, which has low learning costs, on pre-trained large-scale language models. The results of the automatic and manual evaluations in English and Japanese show that it is possible to build a dialogue system with more natural and personalized responses with less computational resources than fine-tuning.