Zhuoren Jiang


2024

pdf bib
Recurrent Alignment with Hard Attention for Hierarchical Text Rating
Chenxi Lin | Ren Jiayu | Guoxiu He | Zhuoren Jiang | Haiyan Yu | Xiaomin Zhu
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

While large language models (LLMs) excel at understanding and generating plain text, they are not tailored to handle hierarchical text structures or directly predict task-specific properties such as text rating. In fact, selectively and repeatedly grasping the hierarchical structure of large-scale text is pivotal for deciphering its essence. To this end, we propose a novel framework for hierarchical text rating utilizing LLMs, which incorporates Recurrent Alignment with Hard Attention (RAHA). Particularly, hard attention mechanism prompts a frozen LLM to selectively focus on pertinent leaf texts associated with the root text and generate symbolic representations of their relationships. Inspired by the gradual stabilization of the Markov Chain, recurrent alignment strategy involves feeding predicted ratings iteratively back into the prompts of another trainable LLM, aligning it to progressively approximate the desired target. Experimental results demonstrate that RAHA outperforms existing state-of-the-art methods on three hierarchical text rating datasets. Theoretical and empirical analysis confirms RAHA’s ability to gradually converge towards the underlying target through multiple inferences. Additional experiments on plain text rating datasets verify the effectiveness of this Markov-like alignment. Our data and code can be available in https://github.com/ECNU-Text-Computing/Markov-LLM.

pdf bib
Can Large Language Models Grasp Legal Theories? Enhance Legal Reasoning with Insights from Multi-Agent Collaboration
Weikang Yuan | Junjie Cao | Zhuoren Jiang | Yangyang Kang | Jun Lin | Kaisong Song | Tianqianjin Lin | Pengwei Yan | Changlong Sun | Xiaozhong Liu
Findings of the Association for Computational Linguistics: EMNLP 2024

Large Language Models (LLMs) could struggle to fully understand legal theories and perform complex legal reasoning tasks. In this study, we introduce a challenging task (confusing charge prediction) to better evaluate LLMs’ understanding of legal theories and reasoning capabilities. We also propose a novel framework: Multi-Agent framework for improving complex Legal Reasoning capability (MALR). MALR employs non-parametric learning, encouraging LLMs to automatically decompose complex legal tasks and mimic human learning process to extract insights from legal rules, helping LLMs better understand legal theories and enhance their legal reasoning abilities. Extensive experiments on multiple real-world datasets demonstrate that the proposed framework effectively addresses complex reasoning issues in practical scenarios, paving the way for more reliable applications in the legal domain.

pdf bib
Evolving Knowledge Distillation with Large Language Models and Active Learning
Chengyuan Liu | Fubang Zhao | Kun Kuang | Yangyang Kang | Zhuoren Jiang | Changlong Sun | Fei Wu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Large language models (LLMs) have demonstrated remarkable capabilities across various NLP tasks. However, their computational costs are prohibitively high. To address this issue, previous research has attempted to distill the knowledge of LLMs into smaller models by generating annotated data. Nonetheless, these works have mainly focused on the direct use of LLMs for text generation and labeling, without fully exploring their potential to comprehend the target task and acquire valuable knowledge. In this paper, we propose EvoKD: Evolving Knowledge Distillation, which leverages the concept of active learning to interactively enhance the process of data generation using large language models, simultaneously improving the task capabilities of small domain model (student model). Different from previous work, we actively analyze the student model’s weaknesses, and then synthesize labeled samples based on the analysis. In addition, we provide iterative feedback to the LLMs regarding the student model’s performance to continuously construct diversified and challenging samples. Experiments and analysis on different NLP tasks, namely, text classification and named entity recognition show the effectiveness of EvoKD.

2021

pdf bib
Adjacency List Oriented Relational Fact Extraction via Adaptive Multi-task Learning
Fubang Zhao | Zhuoren Jiang | Yangyang Kang | Changlong Sun | Xiaozhong Liu
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
A Role-Selected Sharing Network for Joint Machine-Human Chatting Handoff and Service Satisfaction Analysis
Jiawei Liu | Kaisong Song | Yangyang Kang | Guoxiu He | Zhuoren Jiang | Changlong Sun | Wei Lu | Xiaozhong Liu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Chatbot is increasingly thriving in different domains, however, because of unexpected discourse complexity and training data sparseness, its potential distrust hatches vital apprehension. Recently, Machine-Human Chatting Handoff (MHCH), predicting chatbot failure and enabling human-algorithm collaboration to enhance chatbot quality, has attracted increasing attention from industry and academia. In this study, we propose a novel model, Role-Selected Sharing Network (RSSN), which integrates both dialogue satisfaction estimation and handoff prediction in one multi-task learning framework. Unlike prior efforts in dialog mining, by utilizing local user satisfaction as a bridge, global satisfaction detector and handoff predictor can effectively exchange critical information. Specifically, we decouple the relation and interaction between the two tasks by the role information after the shared encoder. Extensive experiments on two public datasets demonstrate the effectiveness of our model.

2020

pdf bib
Camouflaged Chinese Spam Content Detection with Semi-supervised Generative Active Learning
Zhuoren Jiang | Zhe Gao | Yu Duan | Yangyang Kang | Changlong Sun | Qiong Zhang | Xiaozhong Liu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We propose a Semi-supervIsed GeNerative Active Learning (SIGNAL) model to address the imbalance, efficiency, and text camouflage problems of Chinese text spam detection task. A “self-diversity” criterion is proposed for measuring the “worthiness” of a candidate for annotation. A semi-supervised variational autoencoder with masked attention learning approach and a character variation graph-enhanced augmentation procedure are proposed for data augmentation. The preliminary experiment demonstrates the proposed SIGNAL model is not only sensitive to spam sample selection, but also can improve the performance of a series of conventional active learning models for Chinese spam detection task. To the best of our knowledge, this is the first work to integrate active learning and semi-supervised generative learning for text spam detection.

2019

pdf bib
Detect Camouflaged Spam Content via StoneSkipping: Graph and Text Joint Embedding for Chinese Character Variation Representation
Zhuoren Jiang | Zhe Gao | Guoxiu He | Yangyang Kang | Changlong Sun | Qiong Zhang | Luo Si | Xiaozhong Liu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The task of Chinese text spam detection is very challenging due to both glyph and phonetic variations of Chinese characters. This paper proposes a novel framework to jointly model Chinese variational, semantic, and contextualized representations for Chinese text spam detection task. In particular, a Variation Family-enhanced Graph Embedding (VFGE) algorithm is designed based on a Chinese character variation graph. The VFGE can learn both the graph embeddings of the Chinese characters (local) and the latent variation families (global). Furthermore, an enhanced bidirectional language model, with a combination gate function and an aggregation learning function, is proposed to integrate the graph and text information while capturing the sequential information. Extensive experiments have been conducted on both SMS and review datasets, to show the proposed method outperforms a series of state-of-the-art models for Chinese spam detection.