robotics
RT-1 (Robotics Transformer) and SayCan
VIMA: General Robot Manipulation with Multimodal Prompts
multimodal model
MAGMA: a GPT-style multimodal model that can understand any combination of images and language
Versatile-Diffusion: Text, Images and Variations All in One Diffusion Model