Xin Zhang

2026

Understanding Conflicts in Multi-Objective Alignment through Reward Consistency
Zhihao Xu | Yongqi Tong | Xin Zhang | Jun Zhou | Xiting Wang
Findings of the Association for Computational Linguistics: ACL 2026

Multi-objective preference alignment often faces alignment conflicts, where optimizing for one objective (e.g., helpfulness) degrades performance on others (e.g., harmlessness). While prior work focuses on algorithmic solutions, the intrinsic conflict within data and its theoretical impact on training remain underexplored. To bridge this gap, we introduce the principle of Reward Consistency (RC), a theory-grounded criterion that approximates the alignment conflicts via reward models. We prove that a sample mitigates conflicts if and only if it satisfies RC, thereby ensuring improvement across all objectives during optimization. Building on this, we propose Reward Consistency Sampling (RCS), an automated framework for constructing pairwise data that adheres to RC, supplemented by a relaxation strategy to enhance flexibility. Extensive experiments show that RCS brings significant and consistent performance gains, achieving an average improvement of 23.07% in both harmlessness and helpfulness during simultaneous optimization compared to the vanilla dataset. Our data-centric approach is complementary to existing alignment algorithms and effective in both sequential and simultaneous optimization scenarios.

Accurate comprehension and controllable generation of emotion and rhetoric are pivotal for enhancing the reasoning capabilities of large language models (LLMs). Existing studies mostly rely on external optimizations, lacking in-depth exploration of internal representation mechanisms, thus failing to achieve fine-grained steering at the neuron level. A handful of works on neurons are confined to emotions, neglecting rhetoric neurons and their intrinsic connections. Traditional neuron masking also exhibits counterintuitive phenomena, making reliable verification of neuron functionality infeasible. To address these issues, we systematically investigate the neuron representation mechanisms and inherent associations of 6 emotion categories and 4 core rhetorical devices. We propose a neuron identification framework that integrates multi-dimensional screening, and design an adaptive masking method incorporating dynamic filtering, attenuation masking, and feedback optimization, enabling reliable validation of neuron functionality. Through neuron regulation, we achieve directed induction of non-target sentences and enhancement of emotion tasks via rhetoric neurons. Experiments on 5 commonly used datasets validate the effectiveness of our method, providing a novel paradigm for the fine-grained steering of emotion and rhetoric expressions in LLMs.

How to Train a Real-World Silicon Concierge? Internalizing Complex Business Workflow to Only OneModel
Yongqi Tong | Xiaoyun Feng | Lyuxin Xue | Jianshe Li | Xin Zhang | Jiang-Ming Yang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)

Traditional industrial agents rely on modular pipelines, including Router, Retriever, Planner, Executor, Responder, Reviewer and so on, which inevitably fracture into a labyrinth of ad-hoc patches, leading to cascading errors and high latency. We propose OneModel, an applicable paradigm shift from external workflows to internalized knowledge representation. Unlike modular systems that slice fluid user intents into static steps, OneModel consolidates complex business logic and SOPs directly into the model’s parameters. Through Continual Pre-training (CPT) and logic-compilation SFT, we transform fragmented business rules into the model’s intuitive reasoning within a unified attention space. Deployed in our global financial service system, OneModel effectively breaks the impossible triangle of latency, accuracy, and complexity. Online A/B testing demonstrates end-to-end latency reduction of more than 50% (18.7s → 8s) while the Intelligent Resolution Rate (IRR) jumps from 64.3% to 83.3%. The results demonstrate that OneModel effectively replaces brittle engineering logic with internalized cognitive intuition, offering a scalable and future-proof blueprint for transitioning industrial agents from complex, error-prone workflows to unified model architectures.

Co-authors

Fei Li 1

Venues

ACL 2
Findings 1