Shao-Yen Tseng
2024
Why do LLaVA Vision-Language Models Reply to Images in English?
Musashi Hinck | Carolin Holtermann | Matthew Lyle Olson | Florian Schneider | Sungduk Yu | Anahita Bhiwandiwalla | Anne Lauscher | Shao-Yen Tseng | Vasudev Lal
Findings of the Association for Computational Linguistics: EMNLP 2024
2023
ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning
Xiao Xu | Bei Li | Chenfei Wu | Shao-Yen Tseng | Anahita Bhiwandiwalla | Shachar Rosenman | Vasudev Lal | Wanxiang Che | Nan Duan
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
2022
KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation
Yongfei Liu | Chenfei Wu | Shao-Yen Tseng | Vasudev Lal | Xuming He | Nan Duan
Findings of the Association for Computational Linguistics: NAACL 2022
Co-authors
- Vasudev Lal 3
- Chenfei Wu 2
- Anahita Bhiwandiwalla 2
- Nan Duan 2
- Xiao Xu 1