Kechen Jiao
2025
IIET: Efficient Numerical Transformer via Implicit Iterative Euler Method
Xinyu Liu | Bei Li | Jiahao Liu | Junhao Ruan | Kechen Jiao | Hongyin Tang | Jingang Wang | Tong Xiao | JingBo Zhu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
High-order numerical methods enhance Transformer performance on NLP and CV tasks, but they introduce a performance-efficiency trade-off due to increased computational overhead. Our analysis reveals that conventional efficiency techniques, such as distillation, can be detrimental to the performance of these models, exemplified by PCformer. To explore more optimizable ODE-based Transformer architectures, we propose the Iterative Implicit Euler Transformer (IIET), which simplifies high-order methods using an iterative implicit Euler approach. This simplification not only leads to superior performance but also facilitates model compression compared to PCformer. To enhance inference efficiency, we introduce Iteration Influence-Aware Distillation (IIAD). Through a flexible threshold, IIAD allows users to effectively balance the performance-efficiency trade-off. On lm-evaluation-harness, IIET boosts average accuracy by 2.65% over vanilla Transformers and by 0.8% over PCformer. Its efficient variant, E-IIET, cuts inference overhead by 55% while retaining 99.4% of the original task accuracy. Moreover, the most efficient IIET variant achieves an average performance gain exceeding 1.6% over the vanilla Transformer at comparable speed.
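As a reading aid, the iterative implicit Euler update described in the abstract can be sketched as follows. Viewing a pre-norm Transformer block y = x + f(x) as one explicit Euler step, the implicit step y = x + f(y) can be approximated by fixed-point iteration. The sublayer `f`, the iteration count `num_iters`, and the initialization below are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class IterativeImplicitEulerBlock(nn.Module):
    """Sketch of an implicit-Euler residual block solved by fixed-point iteration.

    A standard residual block computes y = x + f(x) (explicit Euler, step size 1).
    The implicit update y = x + f(y) is approximated by iterating
    y_{k+1} = x + f(y_k), starting from the explicit step as the initial guess.
    """

    def __init__(self, f: nn.Module, num_iters: int = 3):
        super().__init__()
        self.f = f                  # e.g. an attention or feed-forward sublayer
        self.num_iters = num_iters  # number of fixed-point refinement steps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = x + self.f(x)           # explicit Euler step as the initial guess
        for _ in range(self.num_iters):
            y = x + self.f(y)       # refine toward the implicit solution y = x + f(y)
        return y
```

Under this reading, reducing the number of refinement iterations in selected layers (the role played by E-IIET and the IIAD threshold) trades a small amount of accuracy for lower inference cost.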
TCPO: Thought-Centric Preference Optimization for Effective Embodied Decision-making
Kechen Jiao | Zhirui Fang | Jiahao Liu | Bei Li | Qifan Wang | Xinyu Liu | Junhao Ruan | Zhongjian Qiao | Yifan Zhu | Yaxin Xu | Jingang Wang | Xiu Li
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Leveraging the generalization capabilities of vision-language models (VLMs) in context-specific, dynamic tasks for embodied artificial intelligence remains a significant challenge. Although supervised fine-tuning (SFT) can better align models with the real physical world, the resulting models still exhibit sluggish responses and hallucination issues in dynamically changing environments and require further alignment. Existing post-SFT methods, which rely on reinforcement learning and chain-of-thought (CoT) approaches, are constrained by sparse rewards and action-only optimization, resulting in low sample efficiency, poor consistency, and model degradation. To address these issues, this paper proposes Thought-Centric Preference Optimization (TCPO) for effective embodied decision-making. Specifically, TCPO introduces a stepwise preference-based optimization approach that transforms sparse reward signals into richer per-step sample pairs. It emphasizes the alignment of the model's intermediate reasoning process, mitigating the problem of model degradation. Moreover, by incorporating an Action Policy Consistency Constraint (APC), it further imposes consistency constraints on the model output. Experiments in the ALFWorld environment demonstrate an average success rate of **26.67%**, a **6%** improvement over RL4VLM, validating the effectiveness of our approach in mitigating model degradation after fine-tuning. These results highlight the potential of integrating preference-based learning techniques with CoT processes to enhance the decision-making capabilities of vision-language models in embodied agents.
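To make the stepwise preference idea concrete, the sketch below shows a DPO-style loss over per-step (chosen, rejected) pairs with an added consistency penalty. The function name, tensor shapes, and the KL form of the consistency term are assumptions for illustration only; they are not the paper's exact APC formulation.

```python
import torch
import torch.nn.functional as F

def stepwise_preference_loss(
    logp_chosen: torch.Tensor,        # log pi_theta(chosen step | context), shape (B,)
    logp_rejected: torch.Tensor,      # log pi_theta(rejected step | context), shape (B,)
    ref_logp_chosen: torch.Tensor,    # same quantities under a frozen reference policy
    ref_logp_rejected: torch.Tensor,
    action_logits: torch.Tensor,      # policy logits over discrete actions, shape (B, A)
    ref_action_logits: torch.Tensor,  # reference logits over the same actions, shape (B, A)
    beta: float = 0.1,
    lam: float = 0.1,
) -> torch.Tensor:
    """Sketch: per-step preference optimization plus a consistency penalty.

    Sparse episode rewards are assumed to have been converted into per-step
    (chosen, rejected) reasoning pairs; the loss prefers chosen steps via a
    DPO-style log-ratio margin, and a KL term (an assumption standing in for
    the APC constraint) discourages the action distribution from drifting.
    """
    # Implicit per-step rewards: log-ratio of the policy against the reference.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    pref_loss = -F.logsigmoid(beta * margin).mean()

    # Assumed consistency term: KL(policy action dist || reference action dist).
    consistency = F.kl_div(
        F.log_softmax(action_logits, dim=-1),
        F.log_softmax(ref_action_logits, dim=-1),
        log_target=True,
        reduction="batchmean",
    )
    return pref_loss + lam * consistency
```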