[Arxiv 23.06] Pave the Way to Grasp Anything: Transferring Foundation Models for Universal Pick-Place Robots
- Install ROS Noetic and MoveIt
- Clone GroundingDINO for open-vocabulary object detection:

git clone https://github.com/IDEA-Research/GroundingDINO.git
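For a quick sanity check of the detector, a call might look like the sketch below, built on GroundingDINO's inference helpers (`load_model`, `load_image`, `predict`). The config/checkpoint paths, image name, prompt, and thresholds are placeholder assumptions, not values shipped with this repo.

```python
# Minimal open-vocabulary detection sketch with GroundingDINO.
# Paths, prompt, and thresholds below are assumptions for illustration.
from groundingdino.util.inference import load_model, load_image, predict

model = load_model(
    "GroundingDINO/groundingdino/config/GroundingDINO_SwinT_OGC.py",  # assumed config path
    "weights/groundingdino_swint_ogc.pth",                            # assumed checkpoint path
)
image_source, image = load_image("demo.jpg")  # any RGB image of the workspace

# The text prompt names the object to pick; thresholds filter weak detections.
boxes, logits, phrases = predict(
    model=model,
    image=image,
    caption="red mug",
    box_threshold=0.35,
    text_threshold=0.25,
)
print(phrases, boxes)  # boxes are normalized cxcywh coordinates
```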
- Clone MixFormer for object tracking:

git clone https://github.com/MCG-NJU/MixFormer.git
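MixFormer serves as a single-object tracker: initialize it on the box the detector returns for the first frame, then update it per frame. The sketch below only illustrates that initialize-then-update loop; `tracker.initialize` and `tracker.update` are hypothetical method names standing in for whatever tracker class you wire up from the MixFormer repo.

```python
# Hypothetical per-frame tracking loop. MixFormer does not ship a one-call
# pip API; the tracker object and its methods are placeholders.
import cv2

def track_object(video_path, init_box, tracker):
    """init_box: (x, y, w, h) from the open-vocab detector on frame 0."""
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    if not ok:
        raise IOError(f"cannot read {video_path}")
    tracker.initialize(frame, init_box)       # hypothetical API
    boxes = [init_box]
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        boxes.append(tracker.update(frame))   # hypothetical API: returns (x, y, w, h)
    cap.release()
    return boxes
```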
- Clone Segment-Anything for segmentation:

git clone https://github.com/facebookresearch/segment-anything.git
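Given a (tracked) bounding box, Segment-Anything turns it into a pixel mask via box-prompted prediction. The sketch below uses the official `segment_anything` API; the ViT-H model type, checkpoint filename, and example box are assumptions.

```python
# Box-prompted segmentation with Segment-Anything.
# Checkpoint name, model type, and the example box are assumptions.
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="weights/sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("demo.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Prompt SAM with the detector/tracker box in xyxy pixel coordinates.
box = np.array([100, 150, 400, 480])
masks, scores, _ = predictor.predict(box=box, multimask_output=False)
print(masks.shape, scores)  # (1, H, W) boolean mask and its confidence
```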
- Prepare and process your own data
- Prepare the environment:
conda env create -f environment.yaml
- Train our two-stream policy:
python train.py
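The actual two-stream architecture is defined in `train.py`; purely to illustrate the idea, the sketch below assumes one stream encodes the RGB observation and the other the object mask produced by the foundation models above, with the fused features regressing an end-effector action. All layer sizes and the 7-D action space are assumptions, not this repo's configuration.

```python
# Generic two-stream policy sketch (NOT the architecture in train.py):
# an RGB stream plus a mask stream, fused and mapped to an action.
import torch
import torch.nn as nn

def conv_stream(in_ch):
    # Small conv encoder shared by both streams; sizes are illustrative.
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 5, stride=2), nn.ReLU(),
        nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )

class TwoStreamPolicy(nn.Module):
    def __init__(self, action_dim=7):
        super().__init__()
        self.rgb_stream = conv_stream(3)    # raw image observation
        self.mask_stream = conv_stream(1)   # object mask from detector/SAM
        self.head = nn.Sequential(
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, action_dim),     # e.g. 6-DoF pose + gripper
        )

    def forward(self, rgb, mask):
        feat = torch.cat([self.rgb_stream(rgb), self.mask_stream(mask)], dim=-1)
        return self.head(feat)

policy = TwoStreamPolicy()
action = policy(torch.rand(1, 3, 128, 128), torch.rand(1, 1, 128, 128))
print(action.shape)  # torch.Size([1, 7])
```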
- Run inference:
python inference/inference.py
Please cite the following papers if you find this repository useful for your research.
@InProceedings{Yang_2025_WACV,
  author    = {Yang, Jiange and Tan, Wenhui and Jin, Chuhao and Yao, Keling and Liu, Bei and Fu, Jianlong and Song, Ruihua and Wu, Gangshan and Wang, Limin},
  title     = {Transferring Foundation Models for Generalizable Robotic Manipulation},
  booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
  month     = {February},
  year      = {2025},
  pages     = {1999-2010}
}

@article{yang2023transferring,
  author  = {Yang, Jiange and Tan, Wenhui and Jin, Chuhao and Yao, Keling and Liu, Bei and Fu, Jianlong and Song, Ruihua and Wu, Gangshan and Wang, Limin},
  title   = {Transferring Foundation Models for Generalizable Robotic Manipulation},
  journal = {arXiv preprint arXiv:2306.05716},
  year    = {2023}
}

@article{yang2023pave,
  author  = {Yang, Jiange and Tan, Wenhui and Jin, Chuhao and Liu, Bei and Fu, Jianlong and Song, Ruihua and Wang, Limin},
  title   = {Pave the Way to Grasp Anything: Transferring Foundation Models for Universal Pick-Place Robots},
  journal = {arXiv preprint arXiv:2306.05716},
  year    = {2023}
}
Thanks to the following open-source projects: