Zixiao Zhu

2025

pdf bib abs
Logit Separability-Driven Samples and Multiple Class-Related Words Selection for Advancing In-Context Learning
Zixiao Zhu | Zijian Feng | Hanzhang Zhou | Junlang Qian | Kezhi Mao
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Effective organization of in-context learning (ICL) demonstrations is key to improving the quality of large language model (LLM) responses. To create better sample-label pairs that instruct LLM understanding, we introduce logit separability, a criterion to assess the clarity of both samples and class-related words at the logit level. This facilitates the optimization of sample and label selection, enhancing the precision of information provided in ICL demonstrations. Additionally, we find that incorporating multiple class-related words for each sample, rather than relying on a single class name, improves performance by offering a broader range of label information. Building on these insights, we propose LICL, a logit separability-based method that jointly organizes samples and integrates multiple class-related words into each sample-label pair. Evaluations across seven classification datasets show that this approach significantly improves ICL performance by providing clearer instructions and richer label information.

pdf bib
Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction
Junlang Qian | Zixiao Zhu | Hanzhang Zhou | Zijian Feng | Zepeng Zhai | Kezhi Mao
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

2024

pdf bib abs
FreeCtrl: Constructing Control Centers with Feedforward Layers for Learning-Free Controllable Text Generation
Zijian Feng | Hanzhang Zhou | Kezhi Mao | Zixiao Zhu
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Controllable text generation (CTG) seeks to craft texts adhering to specific attributes, traditionally employing learning-based techniques such as training, fine-tuning, or prefix-tuning with attribute-specific datasets. These approaches, while effective, demand extensive computational and data resources. In contrast, some proposed learning-free alternatives circumvent learning but often yield inferior results, exemplifying the fundamental machine learning trade-off between computational expense and model efficacy. To overcome these limitations, we propose FreeCtrl, a learning-free approach that dynamically adjusts the weights of selected feedforward neural network (FFN) vectors to steer the outputs of large language models (LLMs). FreeCtrl hinges on the principle that the weights of different FFN vectors influence the likelihood of different tokens appearing in the output. By identifying and adaptively adjusting the weights of attribute-related FFN vectors, FreeCtrl can control the output likelihood of attribute keywords in the generated content. Extensive experiments on single- and multi-attribute control reveal that the learning-free FreeCtrl outperforms other learning-free and learning-based methods, successfully resolving the dilemma between learning costs and model performance.

pdf bib abs
LLMs Learn Task Heuristics from Demonstrations: A Heuristic-Driven Prompting Strategy for Document-Level Event Argument Extraction
Hanzhang Zhou | Junlang Qian | Zijian Feng | Lu Hui | Zixiao Zhu | Kezhi Mao
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In this study, we explore in-context learning (ICL) in document-level event argument extraction (EAE) to alleviate the dependency on large-scale labeled data for this task. We introduce the Heuristic-Driven Link-of-Analogy (HD-LoA) prompting tailored for the EAE task. Specifically, we hypothesize and validate that LLMs learn task-specific heuristics from demonstrations in ICL. Building upon this hypothesis, we introduce an explicit heuristic-driven demonstration construction approach, which transforms the haphazard example selection process into a systematic method that emphasizes task heuristics. Additionally, inspired by the analogical reasoning of human, we propose the link-of-analogy prompting, which enables LLMs to process new situations by drawing analogies to known situations, enhancing their performance on unseen classes beyond limited ICL examples. Experiments show that our method outperforms existing prompting methods and few-shot supervised learning methods on document-level EAE datasets. Additionally, the HD-LoA prompting shows effectiveness in other tasks like sentiment analysis and natural language inference, demonstrating its broad adaptability.

pdf bib abs
PromptExplainer: Explaining Language Models through Prompt-based Learning
Zijian Feng | Hanzhang Zhou | Zixiao Zhu | Kezhi Mao
Findings of the Association for Computational Linguistics: EACL 2024

Pretrained language models have become workhorses for various natural language processing (NLP) tasks, sparking a growing demand for enhanced interpretability and transparency. However, prevailing explanation methods, such as attention-based and gradient-based strategies, largely rely on linear approximations, potentially causing inaccuracies such as accentuating irrelevant input tokens. To mitigate the issue, we develop PromptExplainer, a novel method for explaining language models through prompt-based learning. PromptExplainer aligns the explanation process with the masked language modeling (MLM) task of pretrained language models and leverages the prompt-based learning framework for explanation generation. It disentangles token representations into the explainable embedding space using the MLM head and extracts discriminative features with a verbalizer to generate class-dependent explanations. Extensive experiments demonstrate that PromptExplainer significantly outperforms state-of-the-art explanation methods.

pdf bib abs
EDEntail: An Entailment-based Few-shot Text Classification with Extensional Definition
Zixiao Zhu | Junlang Qian | Zijian Feng | Hanzhang Zhou | Kezhi Mao
Findings of the Association for Computational Linguistics: NAACL 2024

Few-shot text classification has seen significant advancements, particularly with entailment-based methods, which typically use either class labels or intensional definitions of class labels in hypotheses for label semantics expression. In this paper, we propose EDEntail, a method that employs extensional definition (EDef) of class labels in hypotheses, aiming to express the semantics of class labels more explicitly. To achieve the above goal, we develop an algorithm to gather and select extensional descriptive words of class labels and then order and format them into a sequence to form hypotheses. Our method has been evaluated and compared with state-of-the-art models on five classification datasets. The results demonstrate that our approach surpasses the supervised-learning methods and prompt-based methods under the few-shot setting, which underlines the potential of using an extensional definition of class labels for entailment-based few-shot text classification. Our code is available at https://github.com/MidiyaZhu/EDEntail.

Co-authors

Zepeng Zhai 1

Venues

Fix author