About Me

My name is Shaofei Cai (蔡少斐 in Chinese). You can reach me at caishaofei@stu.pku.edu.cn. I am a Ph.D. student at the Institute for Artificial Intelligence, Peking University, advised by Professor Yitao Liang; I started in 2022. Before that, I received my Bachelor’s degree from Xi’an Jiaotong University in 2019 and my Master’s degree from the Institute of Computing Technology, Chinese Academy of Sciences in 2022, advised by Professor Liang Li. My research interests are generative models and sequential control. I currently focus on building generally capable agents in open-world environments such as Minecraft.

🔥 News

  • 2024.10:  🎉🎉 Our paper “ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting” has been released!
  • 2024.10:  🎉🎉 Our paper “OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents” has been accepted to the Conference on Neural Information Processing Systems (NeurIPS) 2024.
  • 2024.01:  🎉🎉 Our paper “GROOT: Learning to Follow Instructions by Watching Gameplay Videos” has been accepted to the International Conference on Learning Representations (ICLR) 2024 as a Spotlight presentation (Top-5%).

📝 Publications

NeurIPS 2024-OWA

ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting

Shaofei Cai, Zihao Wang, Kewei Lian, Zhancun Mu, Xiaojian Ma, Anji Liu, Yitao Liang

NeurIPS 2024 Open-World Agent Workshop | Paper | Cite | Code | Page

ICLR 2024

GROOT: Learning to Follow Instructions by Watching Gameplay Videos

Shaofei Cai, Bowei Zhang, Zihao Wang, Xiaojian Ma, Anji Liu, Yitao Liang

Spotlight Presentation (Top-5%)

ICLR 2024 | Paper | Cite | Code | Page | Twitter

CVPR 2023

Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction

Shaofei Cai, Zihao Wang, Xiaojian Ma, Anji Liu, Yitao Liang

CVPR 2023 | Paper | Cite | Code

NeurIPS 2024

OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents

Zihao Wang, Shaofei Cai, Zhancun Mu, Haowei Lin, Ceyao Zhang, Xuejie Liu, Qing Li, Anji Liu, Xiaojian Ma, Yitao Liang

NeurIPS 2024 | Paper | Twitter | Page

arXiv

JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models

Zihao Wang, Shaofei Cai, Anji Liu, Yonggang Jin, Jinbing Hou, Bowei Zhang, Haowei Lin, Zhaofeng He, Zilong Zheng, Yaodong Yang, Xiaojian Ma, Yitao Liang

arXiv | Paper | Cite | Code | Page | Twitter

NeurIPS 2023

Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents

Zihao Wang, Shaofei Cai, Xiaojian Ma, Anji Liu, Yitao Liang

ICML 2023 TEACH Workshop Best Paper Award

NeurIPS 2023 | Paper | Cite | Code

CVPR 2022

Automatic Relation-aware Graph Network Proliferation

Shaofei Cai, Liang Li, Xinzhe Han, Jiebo Luo, Zheng-Jun Zha, Qingming Huang

Oral Presentation (Top-4%)

CVPR 2022 | Paper | Cite | Code | Video | Poster

CVPR 2021

Rethinking Graph Neural Architecture Search from Message-Passing

Shaofei Cai, Liang Li, Jincan Deng, Beichen Zhang, Zheng-Jun Zha, Li Su, Qingming Huang

CVPR 2021 | Paper | Cite | Code

TNNLS 2023

Semantic and Correlation Disentangled Graph Propagation for Multi-label Image Recognition

Shaofei Cai, Liang Li, Xinzhe Han, Qi Tian, Qingming Huang

IEEE Trans. Neural Netw. Learn. Syst. (TNNLS 2023) | Paper | Cite

WACV 2023

DyStyle: Dynamic Neural Network for Multi-Attribute-Conditioned Style Editing

Bingchuan Li*, Shaofei Cai*, Wei Liu, Peng Zhang, Miao Hua, Qian He, Zili Yi

WACV 2023 | Paper | Cite | Code

ACM MM 2020

IR-GAN: Image Manipulation with Linguistic Instruction by Increment Reasoning

Zhenhuan Liu, Jincan Deng, Liang Li, Shaofei Cai, Qianqian Xu, Shuhui Wang, Qingming Huang

Oral Presentation

ACM MM 2020 | Paper | Cite

arXiv

Edge-featured Graph Neural Architecture Search

Shaofei Cai, Liang Li, Xinzhe Han, Zheng-Jun Zha, Qingming Huang

arXiv | Paper | Cite

🎖 Honors and Awards

Date | Award | Contest & Project
2018.11 | Gold Medal (Rank: 6th/180) | 43rd Asia Beijing Regional (ACM-ICPC)
2018.05 | Gold Medal | National Xi’an Invitational (ACM-ICPC)
2019.06 | Outstanding Student Award | Xi’an Jiaotong University

📖 Education


2022.09 - present, Ph.D. student

Institute for Artificial Intelligence

Peking University, Beijing


2019.09 - 2022.06, Master

Institute of Computing Technology, Chinese Academy of Sciences

University of Chinese Academy of Sciences, Beijing


2015.09 - 2019.06, Undergraduate

Software College

Xi’an Jiaotong University, Xi’an