Jianbo Yuan
2024
InfiMM: Advancing Multimodal Understanding with an Open-Sourced Visual Language Model
Haogeng Liu | Quanzeng You | Yiqi Wang | Xiaotian Han | Bohan Zhai | Yongfei Liu | Wentao Chen | Yiren Jian | Yunzhe Tao | Jianbo Yuan | Ran He | Hongxia Yang
Findings of the Association for Computational Linguistics: ACL 2024
In this work, we present InfiMM, an advanced Multimodal Large Language Model that adapts to intricate vision-language tasks. InfiMM, inspired by the Flamingo architecture, distinguishes itself through the utilization of large-scale training data, comprehensive training strategies, and diverse large language models. This approach ensures the preservation of Flamingo’s foundational strengths while simultaneously introducing augmented capabilities. Empirical evaluations across a variety of benchmarks underscore InfiMM’s remarkable capability in multimodal understanding. The code can be found at: https://anonymous.4open.science/r/infimm-zephyr-F60C/.
An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing
Ziwei Chai | Guoyin Wang | Jing Su | Tianjie Zhang | Xuanwen Huang | Xuwu Wang | Jingjing Xu | Jianbo Yuan | Hongxia Yang | Fei Wu | Yang Yang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
We present Expert-Token-Routing, a unified generalist framework that facilitates seamless integration of multiple expert LLMs. Our framework represents expert LLMs as special expert tokens within the vocabulary of a meta LLM. The meta LLM can route to an expert LLM in the same way it generates a new token. Expert-Token-Routing not only supports learning the implicit expertise of expert LLMs from existing instruction datasets but also allows new expert LLMs to be added dynamically in a plug-and-play manner. It also conceals the detailed collaboration process from the user, so that interaction feels like working with a single LLM. Our framework outperforms various existing multi-LLM collaboration paradigms across benchmarks that incorporate six diverse expert domains, demonstrating effectiveness and robustness in building a generalist LLM system by synergizing multiple expert LLMs.
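The routing idea in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: the `route` function, the `<expert_math>` and `<expert_law>` token names, the lambda experts, and the hand-written scores are all hypothetical stand-ins, assuming only that expert tokens share the meta model's vocabulary and that selecting one hands the query to the matching expert.

```python
from typing import Callable, Dict

# Hypothetical stand-in for an expert LLM: maps a prompt to a response.
Expert = Callable[[str], str]


def route(next_token_scores: Dict[str, float],
          experts: Dict[str, Expert],
          prompt: str) -> str:
    """Pick the highest-scoring next token; if it is an expert token,
    hand the prompt to that expert, otherwise return the token itself."""
    token = max(next_token_scores, key=next_token_scores.get)
    if token in experts:
        # Expert token chosen: delegate the query to the expert.
        return experts[token](prompt)
    # Ordinary token: the meta model would keep generating on its own.
    return token


# Assumed expert registry; adding an entry here mimics plug-and-play extension.
experts: Dict[str, Expert] = {
    "<expert_math>": lambda p: f"[math expert] {p}",
    "<expert_law>": lambda p: f"[law expert] {p}",
}

# Toy scores standing in for the meta LLM's next-token distribution.
scores = {"the": 0.1, "<expert_math>": 0.7, "<expert_law>": 0.2}
answer = route(scores, experts, "Integrate x^2.")
```

In this sketch, registering a new expert only requires a new dictionary entry and a matching token in the score table, which loosely mirrors the plug-and-play extension the abstract describes.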