Nakyeong Yang
2024
Mitigating Biases for Instruction-following Language Models via Bias Neurons Elimination
Nakyeong Yang
|
Taegwan Kang
|
Stanley Jungkyu Choi
|
Honglak Lee
|
Kyomin Jung
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Instruction-following language models often show undesirable biases. These undesirable biases may be accelerated in the real-world usage of language models, where a wide range of instructions is used through zero-shot example prompting. To solve this problem, we first define the bias neuron, which significantly affects biased outputs, and prove its existence empirically. Furthermore, we propose a novel and practical bias mitigation method, CRISPR, to eliminate bias neurons of language models in instruction-following settings. CRISPR automatically determines biased outputs and categorizes neurons that affect the biased outputs as bias neurons using an explainability method. Experimental results demonstrate the effectiveness of our method in mitigating biases under zero-shot instruction-following settings without losing the model’s task performance and existing knowledge. The experimental results reveal the generalizability of our method as it shows robustness under various instructions and datasets. Surprisingly, our method can mitigate the bias in language models by eliminating only a few neurons (at least three).
2023
Multi-View Zero-Shot Open Intent Induction from Dialogues: Multi Domain Batch and Proxy Gradient Transfer
Hyukhun Koh
|
Haesung Pyun
|
Nakyeong Yang
|
Kyomin Jung
Proceedings of The Eleventh Dialog System Technology Challenge
In Task Oriented Dialogue (TOD) system, detecting and inducing new intents are two main challenges to apply the system in the real world. In this paper, we suggest the semantic multiview model to resolve these two challenges: (1) SBERT for General Embedding (GE), (2) Multi Domain Batch (MDB) for dialogue domain knowledge, and (3) Proxy Gradient Transfer (PGT) for cluster-specialized semantic. MDB feeds diverse dialogue datasets to the model at once to tackle the multi-domain problem by learning the multiple domain knowledge. We introduce a novel method PGT, which employs the Siamese network to fine-tune the model with a clustering method directly. Our model can learn how to cluster dialogue utterances by using PGT. Experimental results demonstrate that our multi-view model with MDB and PGT significantly improves the Open Intent Induction performance compared to baseline systems.
Task-specific Compression for Multi-task Language Models using Attribution-based Pruning
Nakyeong Yang
|
Yunah Jang
|
Hwanhee Lee
|
Seohyeong Jeong
|
Kyomin Jung
Findings of the Association for Computational Linguistics: EACL 2023
Multi-task language models show outstanding performance for various natural language understanding tasks with only a single model. However, these language models inevitably utilize an unnecessarily large number of model parameters, even when used only for a specific task. In this paper, we propose a novel training-free compression method for multi-task language models using pruning method. Specifically, we use an attribution method to determine which neurons are essential for performing a specific task. We task-specifically prune unimportant neurons and leave only task-specific parameters. Furthermore, we extend our method to be applicable in both low-resource and unsupervised settings. Since our compression method is training-free, it uses little computing resources and does not update the pre-trained parameters of language models, reducing storage space usage. Experimental results on the six widely-used datasets show that our proposed pruning method significantly outperforms baseline pruning methods. In addition, we demonstrate that our method preserves performance even in an unseen domain setting.
Search
Fix data
Co-authors
- Kyomin Jung 3
- Stanley Jungkyu Choi 1
- Yunah Jang 1
- Seohyeong Jeong 1
- Taegwan Kang 1
- show all...