Human-in-the-loop Robotic Grasping Using BERT Scene Representation

Yaoxian Song; Penglei Sun; Pengfei Fang; Linyi Yang; Yanghua Xiao; Yue Zhang

Abstract

NLP techniques are now widely applied across different domains. In this paper, we propose a human-in-the-loop framework for robotic grasping in cluttered scenes. The framework provides a language interface to the grasping process, which allows the user to intervene through natural language commands. It is built on a state-of-the-art grasping baseline, in which we replace the scene-graph representation with a text representation of the scene using BERT. Experiments both in simulation and on a physical robot show that the proposed method outperforms conventional object-agnostic and scene-graph-based methods in the literature. In addition, we find that human intervention can significantly improve performance. Our dataset and code are available on our project website: https://sites.google.com/view/hitl-grasping-bert.

Anthology ID: 2022.coling-1.265
Volume: Proceedings of the 29th International Conference on Computational Linguistics
Month: October
Year: 2022
Address: Gyeongju, Republic of Korea
Editors: Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
Venue: COLING
Publisher: International Committee on Computational Linguistics
Pages: 2992–3006
URL: https://aclanthology.org/2022.coling-1.265/
Cite (ACL): Yaoxian Song, Penglei Sun, Pengfei Fang, Linyi Yang, Yanghua Xiao, and Yue Zhang. 2022. Human-in-the-loop Robotic Grasping Using BERT Scene Representation. In Proceedings of the 29th International Conference on Computational Linguistics, pages 2992–3006, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal): Human-in-the-loop Robotic Grasping Using BERT Scene Representation (Song et al., COLING 2022)
PDF: https://aclanthology.org/2022.coling-1.265.pdf
