Meng Gao


2024

pdf bib
From Discrimination to Generation: Low-Resource Intent Detection with Language Model Instruction Tuning
Feng Zhang | Wei Chen | Fei Ding | Meng Gao | Tengjiao Wang | Jiahui Yao | Jiabin Zheng
Findings of the Association for Computational Linguistics: ACL 2024

Intent detection aims to identify user goals from utterances, and is a ubiquitous step towards the satisfaction of user desired needs in many interaction systems. As dynamic and varied intents arise, models that are capable of identifying new intents promptly are required. However, existing studies usually fine-tune discriminative models on the specific defined intent classes, precluding them from being directly adopted to new intent domains. In this paper, we introduce a generative pre-trained intent model that can recognize new intents from different domains in low-resource scenarios. We reformulate intent detection into a generation task and design descriptive and regularized instructions to guide the model effectively to detect new intents in open domains with no parameter updates. To validate the proposed method, we introduce a new intent detection benchmark, including the Meta-Intent Dataset and three types of representative evaluation settings. We conduct extensive experiments which demonstrate that our method outperforms a range of strong baselines that needs further fine-tuning or domain-specific samples.

2019

pdf bib
Self-Adaptive Scaling for Learnable Residual Structure
Fenglin Liu | Meng Gao | Yuanxin Liu | Kai Lei
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Residual has been widely applied to build deep neural networks with enhanced feature propagation and improved accuracy. In the literature, multiple variants of residual structure are proposed. However, most of them are manually designed for particular tasks and datasets and the combination of existing residual structures has not been well studied. In this work, we propose the Self-Adaptive Scaling (SAS) approach that automatically learns the design of residual structure from data. The proposed approach makes the best of various residual structures, resulting in a general architecture covering several existing ones. In this manner, we construct a learnable residual structure which can be easily integrated into a wide range of residual-based models. We evaluate our approach on various tasks concerning different modalities, including machine translation (IWSLT-2015 EN-VI and WMT-2014 EN-DE, EN-FR), image classification (CIFAR-10 and CIFAR-100), and image captioning (MSCOCO). Empirical results show that the proposed approach consistently improves the residual-based models and exhibits desirable generalization ability. In particular, by incorporating the proposed approach to the Transformer model, we establish new state-of-the-arts on the IWSLT-2015 EN-VI low-resource machine translation dataset.