2025
Improving Preference Alignment of LLM with Inference-Free Self-Refinement
Fukun Ma | Kaibin Tian | Jieting Xue | Xiaoyi Wang | Ye Ma | Quan Chen | Peng Jiang | Lijie Wen
Findings of the Association for Computational Linguistics: EMNLP 2025
Large language models (LLMs) develop in-context learning capability through pretraining and instruction tuning, enabling task adaptation without parameter updates. Self-refinement is one manifestation of this capability, allowing LLMs to iteratively refine their output using self-generated feedback. However, empirical observations reveal Inference-Free Self-Refinement (IFSR) in preference alignment: LLMs generate preference-improved output via fixed instructions, requiring no specific feedback and even no initial responses. IFSR in preference alignment has two key components. The refining instruction is a fixed instruction that constrains the output distribution from a preference-semantic perspective; during training, it facilitates joint learning of preference-related semantic representations and data-distribution alignment. The pseudo reference response is constructed from paired preference data and serves as a demonstration to guide the output distribution; it mitigates off-policy distributional bias while enhancing token-level preference learning during training. Experiments across multiple datasets demonstrate that incorporating IFSR into preference alignment yields performance improvements of over 10%. Further ablation studies reveal additional characteristics and potential principles of IFSR.
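The abstract leaves the construction details to the paper; below is a minimal sketch of how one preference pair might be turned into an IFSR-style training example, with a fixed refining instruction and the rejected response standing in as the pseudo reference response. The instruction text, prompt template, and field names are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of building an IFSR-style training example.
# The refining instruction wording and prompt layout are assumptions
# for illustration; the paper defines the actual construction.

REFINING_INSTRUCTION = (  # fixed instruction, identical for every example
    "Refine the response below so that it better matches human preferences."
)

def build_ifsr_example(prompt: str, chosen: str, rejected: str) -> dict:
    """Turn one preference pair into an IFSR-augmented training example.

    The rejected response serves as the pseudo reference response: it is
    shown as a demonstration to be refined, and the chosen response is the
    training target, so the model sees token-level preference contrasts.
    """
    refined_prompt = (
        f"{prompt}\n\n"
        f"Reference response:\n{rejected}\n\n"  # pseudo reference response
        f"{REFINING_INSTRUCTION}"
    )
    return {"prompt": refined_prompt, "target": chosen}

example = build_ifsr_example(
    prompt="Explain photosynthesis to a child.",
    chosen="Plants use sunlight to turn air and water into food...",
    rejected="Photosynthesis is the conversion of CO2 and H2O...",
)
print(example["prompt"])
```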
2022
PA Ph&Tech at SemEval-2022 Task 11: NER Task with Ensemble Embedding from Reinforcement Learning
Qizhi Lin | Changyu Hou | Xiaopeng Wang | Jun Wang | Yixuan Qiao | Peng Jiang | Xiandi Jiang | Benqi Wang | Qifeng Xiao
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
From pretrained contextual embeddings to document-level embeddings, the selection and construction of embeddings have drawn increasing attention in recent NER research. This paper discusses the performance of ensemble embeddings on complex NER tasks. Inspired by Wang’s methodology, we attempt to transfer the dominant performance of ensemble models with a reinforcement learning optimizer from plain NER tasks to complex ones. Based on the composition of the SemEval dataset, the performance of the applied model is tested on lower-context, QA, and search-query scenarios, together with its zero-shot learning ability. Results show that with abundant training data, the model can achieve performance on lower-context cases similar to plain NER cases, but can barely transfer this performance to other scenarios in the test phase.
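The reinforcement learning optimizer mentioned here is plausibly in the spirit of RL-driven embedding selection (e.g., ACE-style controllers); a toy sketch of that idea follows. The candidate embedding names, dummy reward, and REINFORCE update are illustrative assumptions rather than the system described in the paper.

```python
import torch

# Toy sketch of RL-driven embedding selection: a controller samples a
# subset of candidate embeddings, a downstream NER model would be trained
# on their concatenation, and dev F1 is the reward. All names and the
# reward function are placeholders, not the authors' implementation.

candidates = ["bert", "xlmr", "flair", "word2vec"]  # embedding sources
logits = torch.zeros(len(candidates), requires_grad=True)  # controller
opt = torch.optim.Adam([logits], lr=0.1)

def dev_f1(mask: torch.Tensor) -> float:
    """Placeholder: train a NER tagger on the concatenation of the
    selected embeddings and return its dev-set F1."""
    return float(mask.sum()) / len(candidates)  # dummy reward

baseline = 0.0
for step in range(20):
    probs = torch.sigmoid(logits)
    mask = torch.bernoulli(probs)  # sample a subset of embeddings
    reward = dev_f1(mask)
    # REINFORCE: raise the log-prob of the sampled mask, scaled by advantage
    log_prob = (mask * torch.log(probs + 1e-8)
                + (1 - mask) * torch.log(1 - probs + 1e-8)).sum()
    loss = -(reward - baseline) * log_prob
    opt.zero_grad()
    loss.backward()
    opt.step()
    baseline = 0.9 * baseline + 0.1 * reward  # moving-average baseline
```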
SFE-AI at SemEval-2022 Task 11: Low-Resource Named Entity Recognition using Large Pre-trained Language Models
Changyu Hou | Jun Wang | Yixuan Qiao | Peng Jiang | Peng Gao | Guotong Xie | Qizhi Lin | Xiaopeng Wang | Xiandi Jiang | Benqi Wang | Qifeng Xiao
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
Large-scale pre-trained models have been widely used in named entity recognition (NER) tasks. However, model ensembling through parameter averaging or voting cannot fully exploit the complementary advantages of different models, especially in the open domain. This paper describes our NER system for SemEval-2022 Task 11: MultiCoNER. We propose an effective system that adaptively ensembles pre-trained language models via a Transformer layer. By assigning different weights to each model for different inputs, the Transformer layer integrates the advantages of diverse models effectively. Experimental results show that our method achieves superior performance in Farsi and Dutch.
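As a rough illustration of the Transformer-layer ensembling the abstract describes, the sketch below attends over per-model token representations so that the mixing weights depend on the input. The dimensions, mean pooling, and classifier head are assumptions made to keep the example runnable, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class TransformerEnsembler(nn.Module):
    """Sketch: fuse per-token representations from several backbones
    with a Transformer layer, so mixing weights are input-dependent."""

    def __init__(self, n_models: int, hidden: int, n_labels: int):
        super().__init__()
        # One encoder layer attends across the per-model representations
        # of each token (the "model" axis plays the role of the sequence).
        self.mixer = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=4, batch_first=True
        )
        self.classifier = nn.Linear(hidden, n_labels)

    def forward(self, stacked: torch.Tensor) -> torch.Tensor:
        # stacked: (batch, seq_len, n_models, hidden), one row per backbone
        b, t, m, h = stacked.shape
        mixed = self.mixer(stacked.reshape(b * t, m, h))  # attend over models
        fused = mixed.mean(dim=1).reshape(b, t, h)        # pool model axis
        return self.classifier(fused)                     # per-token logits

# Toy usage: 3 backbones, hidden size 64, 9 BIO labels
ens = TransformerEnsembler(n_models=3, hidden=64, n_labels=9)
logits = ens(torch.randn(2, 16, 3, 64))  # -> (2, 16, 9)
```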