Xu Shaocong
2023
Foundation Models for Robotics: Best Known Practices
Xu Shaocong | Zhao Hao
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 4: Tutorial Abstracts)
“Artificial general intelligence (AGI) used to be a sci-fi term, but recently the surprising generalization capability of foundation models has triggered a lot of attention to AGI in both academia and industry. Large language models can now answer questions or chat with human beings using fluent sentences and clear reasoning. Diffusion models can now draw pictures of unprecedented photo-realism according to human commands and controls. Researchers have also made substantial efforts to explore new possibilities for robotics applications with the help of foundation models. Since this interdisciplinary field is still developing rapidly, there are no clear methodological conclusions for now. In this tutorial, I will briefly go through best known practices that have shown transformative capabilities in several sub-fields. Specifically, there are five representative paradigms: (1) using foundation models to allow human-friendly human-car interaction; (2) using foundation models to equip robots with the capability to understand vague human needs; (3) using foundation models to break down complex tasks into achievable sub-tasks; (4) using foundation models to compose skill primitives so that reinforcement learning can work with sparse rewards; (5) using foundation models to bridge language commands and low-level control dynamics. I hope these best known practices will inspire NLP researchers.”