Mengyuan Li (李梦媛) - ACL Anthology

Mengyuan Li

Also published as: MengYuan Li, 梦媛李

2024

基于逻辑推理和多任务融合的认知刺激对话生成方法(Cognitive stimulation dialogue generation method based on logical reasoning and multi-task integration)
Yuru Jiang (蒋玉茹) | Mengyuan Li (李梦媛) | Yuyang Tao (陶宇阳) | Keming Qu (区可明) | Zepeng She (佘泽鹏) | Shuicai Shi (施水才)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

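Against the backdrop of global population aging, dialogue systems with cognitive stimulation are an important means of maintaining the cognitive health of older adults. Research on the Chinese Cognitive Stimulation Conversation Dataset (CSConv) and on the corresponding models has only just begun. This paper treats cognitive stimulation dialogue generation as a logical reasoning process that fuses multiple tasks: the logical relationships among the emotion classification task, the decision task, and the dialogue response generation task are modeled as a reasoning process that guides generation by a large language model. For the decision task, the paper proposes a decision model with a hierarchical encoder structure. The decision experiments show that this model effectively improves the accuracy of the decision task. For the multi-task process, the paper proposes a multi-task fusion method that combines the models for the three tasks. The generation experiments show that fusing classification, decision, and generation significantly improves dialogue response ability, demonstrating the effectiveness and advanced nature of the method.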

2020

Towards Non-task-specific Distillation of BERT via Sentence Representation Approximation
Bowen Wu | Huan Zhang | MengYuan Li | Zongsheng Wang | Qihang Feng | Junhong Huang | Baoxun Wang
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

Recently, BERT has become an essential ingredient of various deep NLP models due to its effectiveness and universal usability. However, the online deployment of BERT is often blocked by its large number of parameters and high computational cost. Many studies have shown that knowledge distillation is efficient in transferring the knowledge from BERT into a model with fewer parameters. Nevertheless, current BERT distillation approaches mainly focus on task-specific distillation, and such methodologies lose the general semantic knowledge that gives BERT its universal usability. In this paper, we propose a distillation framework oriented toward sentence representation approximation that can distill the pre-trained BERT into a simple LSTM-based model without specifying tasks. Consistent with BERT, our distilled model is able to perform transfer learning via fine-tuning to adapt to any sentence-level downstream task. In addition, our model can further cooperate with task-specific distillation procedures. The experimental results on multiple NLP tasks from the GLUE benchmark show that our approach outperforms other task-specific distillation methods or even much larger models, i.e., ELMo, with well-improved efficiency.

Leveraging the persona information of users in Neural Response Generators (NRG) to perform personalized conversations has been considered an attractive and important topic in the research of conversational agents over the past few years. Despite the promising progress achieved by recent studies in this field, persona information tends to be incorporated into neural networks in the form of user embeddings, with the expectation that the persona can be involved via end-to-end learning. This paper proposes to incorporate the personality-related characteristics of human conversations into variational response generators by designing a specific deep model based on a conditional variational autoencoder, with two new regularization terms added to the loss function to guide the optimization towards generating responses that are both persona-aware and relevant. In addition, to reasonably evaluate the performance of various persona modeling approaches, this paper further presents three direct persona-oriented metrics from different perspectives. The experimental results show that our proposed methodology can notably improve the performance of persona-aware response generation, and that the metrics are reasonable for evaluating the results.
