Zhipeng Xie
2024
Discriminative Language Model as Semantic Consistency Scorer for Prompt-based Few-Shot Text Classification
Zhipeng Xie | Yahe Li
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
A successful prompt-based finetuning method should satisfy three prerequisites: task compatibility, input compatibility, and evidence abundance. Guided by this belief, this paper designs a novel prompt-based method (called DLM-SCS) for few-shot text classification, which utilizes the discriminative language model ELECTRA, pretrained to distinguish whether a token is original or replaced. The method is built on the intuitive idea that a prompt instantiated with the true label should have a higher semantic consistency score than prompts instantiated with false labels. Since a prompt usually consists of several components (or parts), its semantic consistency can be decomposed accordingly, so that each part provides information for semantic consistency discrimination. The semantic consistency of each component is then computed with the pretrained ELECTRA model, introducing no extra parameters. Extensive experiments show that our model outperforms several state-of-the-art prompt-based few-shot methods on 10 widely used text classification tasks.
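The core scoring step can be sketched with the pretrained ELECTRA discriminator from Hugging Face Transformers. This is a minimal illustration, not the paper's DLM-SCS implementation: the template, the label words, and the uniform token-level averaging are assumptions here, whereas the paper decomposes the score over prompt components.

```python
# Minimal sketch of discriminator-based semantic consistency scoring.
# Assumptions (not from the paper): the "It was <label>." template,
# the label words, and averaging the "original" probability over all tokens.
import torch
from transformers import ElectraTokenizer, ElectraForPreTraining

tokenizer = ElectraTokenizer.from_pretrained("google/electra-base-discriminator")
model = ElectraForPreTraining.from_pretrained("google/electra-base-discriminator")
model.eval()

def consistency_score(text: str, label_word: str) -> float:
    # Instantiate the prompt with a candidate label word.
    prompt = f"{text} It was {label_word}."
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        # ELECTRA's per-token logits: positive = "replaced", negative = "original".
        logits = model(**inputs).logits
    # Mean probability that tokens are "original" = semantic consistency proxy.
    return torch.sigmoid(-logits).mean().item()

text = "The movie was a delightful surprise from start to finish."
scores = {w: consistency_score(text, w) for w in ("great", "terrible")}
print(max(scores, key=scores.get))  # the true label should score higher
```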
2021
A Mixture-of-Experts Model for Antonym-Synonym Discrimination
Zhipeng Xie | Nan Zeng
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Discriminating between antonyms and synonyms is an important and challenging NLP task. Antonyms and synonyms often occur in the same or similar contexts and are therefore hard to distinguish. This paper proposes two underlying hypotheses and employs the mixture-of-experts framework as a solution. It works on the basis of a divide-and-conquer strategy, where a number of localized experts focus on their own domains (or subspaces) to learn their specialties, and a gating mechanism determines the space partitioning and the expert mixture. Experimental results show that our method achieves state-of-the-art performance on the task.
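A minimal sketch of the gated mixture-of-experts setup described above, assuming a word pair is represented as the concatenation of two word embeddings; the number of experts, layer sizes, and gating details are illustrative, not the paper's configuration.

```python
# Mixture-of-experts sketch for antonym-vs-synonym classification.
# Assumptions: 300-d word vectors concatenated into a 600-d pair embedding;
# 4 experts and hidden size 64 are arbitrary illustrative choices.
import torch
import torch.nn as nn

class MoEClassifier(nn.Module):
    def __init__(self, input_dim: int, num_experts: int = 4, hidden: int = 64):
        super().__init__()
        # Localized experts: each specializes in a subspace of word pairs.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(input_dim, hidden), nn.ReLU(), nn.Linear(hidden, 2))
            for _ in range(num_experts)
        )
        # The gate softly partitions the input space among the experts.
        self.gate = nn.Linear(input_dim, num_experts)

    def forward(self, pair_embedding: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.gate(pair_embedding), dim=-1)               # (B, E)
        outputs = torch.stack([e(pair_embedding) for e in self.experts], dim=1)  # (B, E, 2)
        # Mixture: gate-weighted sum of expert logits (antonym vs. synonym).
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)                      # (B, 2)

model = MoEClassifier(input_dim=600)
logits = model(torch.randn(8, 600))  # batch of 8 word-pair embeddings
```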
Effect Generation Based on Causal Reasoning
Feiteng Mu | Wenjie Li | Zhipeng Xie
Findings of the Association for Computational Linguistics: EMNLP 2021
Causal reasoning aims to predict the future scenarios that may be caused by observed actions. However, existing causal reasoning methods handle causality at the word level. In this paper, we propose a novel event-level causal reasoning method and demonstrate its use in the task of effect generation. In particular, we structure the observed cause-effect event pairs into an event causality network that describes causality dependencies. Given an input cause sentence, a causal subgraph is retrieved from the event causality network and encoded with a graph attention mechanism to support better reasoning about the potential effects. The most probable effect event is then selected from the causal subgraph and used as guidance to generate an effect sentence. Experiments show that our method generates more reasonable effect sentences than various well-designed competitors.
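A minimal single-head graph attention layer in the spirit of the subgraph encoding step; the node features, the 0/1 adjacency encoding of causality edges, and all dimensions are illustrative assumptions rather than the paper's architecture.

```python
# Single-head graph attention sketch for encoding a retrieved causal subgraph.
# Assumptions: event nodes carry dense features; adj is a 0/1 causality
# adjacency matrix that includes self-loops (so every row has a neighbor).
import torch
import torch.nn as nn

class GraphAttentionLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)
        self.attn = nn.Linear(2 * out_dim, 1, bias=False)
        self.act = nn.LeakyReLU(0.2)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (N, in_dim) event-node features; adj: (N, N) causality edges.
        h = self.proj(x)                                    # (N, out_dim)
        n = h.size(0)
        # Attention score for every (source, target) node pair.
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        scores = self.act(self.attn(pairs)).squeeze(-1)     # (N, N)
        # Restrict attention to neighbors in the causal subgraph.
        scores = scores.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(scores, dim=-1)
        return torch.relu(alpha @ h)                        # aggregated encodings

layer = GraphAttentionLayer(128, 64)
out = layer(torch.randn(5, 128), torch.eye(5))  # 5 event nodes, self-loops only
```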