Haoran Que
2026
COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values
Siwei Wu | JinCheng Ren | Xeron Du | Shuyue Guo | Xingwei Qu | Yiming Liang | Jie Liu | Yunwen Li | Tyler Loakman | Tianyu Zheng | Boyu Feng | Huaqing Yuan | Zili Wang | Jiaheng Liu | Wenhao Huang | Chenglin Cai | Haoran Que | Jian Yang | Yuelin Bai | Zekun Moore Wang | Zhouliang Yu | Qunshu Lin | Ding Pan | Yuchen Eleanor Jiang | Tiannan Wang | Wangchunshu Zhou | Shenzhi Wang | Xingyuan Bu | Minghao Liu | Guoyin Wang | Ge Zhang | Chenghua Lin
Findings of the Association for Computational Linguistics: EACL 2026
Existing Chinese preference datasets suffer from limited scale, restricted domain coverage, and insufficiently rigorous data validation. Human annotation significantly limits the scalability of human preference datasets. As a result, Chinese Alignment and Chinese Reward Models (CRM) have not yet been thoroughly explored. To address these challenges, we design an LLM-based data annotation pipeline with no human intervention. Based on this pipeline, we curate COIG-P (Chinese Open Instruction Generalist - Preference), a high-quality, large-scale Chinese preference dataset consisting of 1M Chinese preference pairs and 92k carefully curated Chinese queries across diverse domains, including Chat, Coding, Maths, and others. We conduct experiments to verify the quality of COIG-P from two perspectives. (1) COIG-P brings significant performance improvements for the Qwen2/2.5 and Infinity-Instruct model series on AlignBench through DPO, with gains ranging from 2% to 12%. Furthermore, it significantly outperforms other existing Chinese preference datasets. (2) We train an 8B-sized CRM and manually annotate a Chinese Reward Benchmark (CRBench). Our CRM demonstrates robust scoring ability on CRBench. In addition, in practical data construction experiments, the quality of the data constructed by our CRM is comparable to that produced by GPT-4o.
2025
PIC: Unlocking Long-Form Text Generation Capabilities of Large Language Models via Position ID Compression
Haoran Que | Wenge Rong
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Long-context understanding is crucial for large language models (LLMs) and has become a fundamental capability for most LLMs. However, beyond the focus on “input-long”, the ability to “output-long” is equally significant, yet it remains underexplored. To address this limitation, we propose a simple, efficient, and plug-in approach, Position ID Compression (PIC), to unlock the long-form text generation potential of LLMs. The idea is straightforward: by compressing the position ids of the context, we provoke and guide LLMs to generate coherent and longer output. Specifically, we find that directly reducing the position ids by a fixed ratio significantly impacts the generation quality. To mitigate this, we propose two variants of PIC: NTK-aware PIC and Dynamic PIC. Without additional training, both methods enable LLMs to extend their generation length by approximately 1.5 times without compromising generation quality. Furthermore, by integrating supervised fine-tuning (SFT) with PIC, we propose PIC-SFT, which further improves LLMs’ long-form text generation capabilities, achieving top performance on HelloBench and LongBench-Write. Extensive experiments demonstrate the effectiveness of our approach.
MIO: A Foundation Model on Multimodal Tokens
Zekun Moore Wang | King Zhu | Chunpu Xu | Wangchunshu Zhou | Jiaheng Liu | Yibo Zhang | Jessie Wang | Ning Shi | Siyu Li | Yizhi Li | Haoran Que | Zhaoxiang Zhang | Yuanxing Zhang | Ge Zhang | Ke Xu | Jie Fu | Wenhao Huang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
In this paper, we introduce MIO, a novel foundation model built on multimodal tokens, capable of understanding and generating speech, text, images, and videos in an end-to-end, autoregressive manner. While the emergence of large language models (LLMs) and multimodal large language models (MM-LLMs) propels advancements in artificial general intelligence through their versatile capabilities, they still lack true any-to-any understanding and generation. Recently, the release of GPT-4o has showcased the remarkable potential of any-to-any LLMs for complex real-world tasks, enabling omnidirectional input and output across images, speech, and text. However, it is closed-source and does not support the generation of multimodal interleaved sequences. To address this gap, we present MIO, which is trained on a mixture of discrete tokens across four modalities using causal multimodal modeling. MIO undergoes a four-stage training process: (1) alignment pre-training, (2) interleaved pre-training, (3) speech-enhanced pre-training, and (4) comprehensive supervised fine-tuning on diverse textual, visual, and speech tasks. Our experimental results indicate that MIO exhibits competitive, and in some cases superior, performance compared to previous dual-modal baselines, any-to-any model baselines, and even modality-specific baselines. Moreover, MIO demonstrates advanced capabilities inherent to its any-to-any feature, such as interleaved video-text generation, chain-of-visual-thought reasoning, visual guideline generation, instructional image editing, etc.
2024
E2-LLM: Efficient and Extreme Length Extension of Large Language Models
Jiaheng Liu | Zhiqi Bai | Yuanxing Zhang | Chenchen Zhang | Yu Zhang | Ge Zhang | Jiakai Wang | Haoran Que | Yukang Chen | Wenbo Su | Tiezheng Ge | Jie Fu | Wenhu Chen | Bo Zheng
Findings of the Association for Computational Linguistics: ACL 2024
Training Large Language Models (LLMs) to process extensive context lengths incurs prohibitive computational costs. Prevailing techniques for extending context capabilities in LLMs typically require not only additional training procedures but also access to datasets with long context (e.g., sequences of 32K tokens), presupposing substantial GPU expenditures. To address the aforementioned issues, we introduce a novel solution named Efficient and Extreme length extension for Large Language Models (E2-LLM). E2-LLM entails a singular training process over considerably short sequences (e.g., 4K tokens), which greatly mitigates the cost of continual-pretraining or fine-tuning. Within the training phase, we incorporate a dual augmentation strategy with Rotary Position Embeddings (RoPE) that adjusts the scale and position indices across distinct training samples. E2-LLM is meticulously designed to enhance the model’s robustness to diverse relative positions. The experimental results on multiple benchmark datasets demonstrate the superior performance of E2-LLM on demanding tasks of processing long contexts.
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models
Noah Wang | Z.y. Peng | Haoran Que | Jiaheng Liu | Wangchunshu Zhou | Yuhan Wu | Hongcheng Guo | Ruitong Gan | Zehao Ni | Jian Yang | Man Zhang | Zhaoxiang Zhang | Wanli Ouyang | Ke Xu | Wenhao Huang | Jie Fu | Junran Peng
Findings of the Association for Computational Linguistics: ACL 2024
The advent of Large Language Models (LLMs) has paved the way for complex tasks such as role-playing, which enhances user interactions by enabling models to imitate various characters. However, the closed-source nature of state-of-the-art LLMs and their general-purpose training limit role-playing optimization. In this paper, we introduce RoleLLM, a framework to benchmark, elicit, and enhance role-playing abilities in LLMs. RoleLLM comprises four stages: (1) Role Profile Construction for 100 roles; (2) Context-Based Instruction Generation (Context-Instruct) for role-specific knowledge extraction; (3) Role Prompting using GPT (RoleGPT) for speaking style imitation; and (4) Role-Conditioned Instruction Tuning (RoCIT) for fine-tuning open-source models along with role customization. By Context-Instruct and RoleGPT, we create RoleBench, the first systematic and fine-grained character-level benchmark dataset for role-playing with 168,093 samples. Moreover, RoCIT on RoleBench yields RoleLLaMA (English) and RoleGLM (Chinese), significantly enhancing role-playing abilities and even achieving comparable results with RoleGPT (using GPT-4).
Co-authors
- Jiaheng Liu 4
- Jie Fu 3
- Wenhao Huang 3
- Ge Zhang 3
- Wangchunshu Zhou 3
- Zekun Moore Wang 2
- Jian Yang 2
- Yuanxing Zhang 2
- Zhaoxiang Zhang 2
- Zhiqi Bai 1
- Yuelin Bai 1
- Xingyuan Bu 1
- Chenglin Cai 1
- Yukang Chen 1
- Wenhu Chen 1
- Xeron Du 1
- Boyu Feng 1
- Ruitong Gan 1
- Tiezheng Ge 1
- Hongcheng Guo 1
- Shuyue Guo 1
- Yuchen Eleanor Jiang 1
- Siyu Li 1
- Yizhi Li 1
- Yunwen Li 1
- Yiming Liang 1
- Qunshu Lin 1
- Chenghua Lin 1
- Jie Liu 1
- Minghao Liu 1
- Tyler Loakman 1
- Zehao Ni 1
- Wanli Ouyang 1
- Ding Pan 1
- Z.y. Peng 1
- Junran Peng 1
- Xingwei Qu 1
- JinCheng Ren 1
- Wenge Rong 1
- Ning Shi 1
- Wenbo Su 1
- Jiakai Wang 1
- Noah Wang 1
- Jessie Wang 1
- Zili Wang 1
- Tiannan Wang 1
- Shenzhi Wang 1
- Guoyin Wang 1
- Yuhan Wu 1
- Siwei Wu 1
- Ke Xu 1
- Chunpu Xu 1
- Ke Xu 1
- Zhouliang Yu 1
- Huaqing Yuan 1
- Chenchen Zhang 1
- Yu Zhang 1
- Man Zhang 1
- Yibo Zhang 1
- Bo Zheng 1
- Tianyu Zheng 1
- King Zhu 1