Generative AI is taking automated common-sense reasoning, task planning, and perception to a new level. It is also revolutionizing synthetic data generation, human-computer interaction, and multimodal understanding. Collectively, these are some of the key capabilities required for robots to understand our world and provide humanity with accessible, versatile physical assistance for day-to-day tasks. The key missing ingredient is for generative AI to also understand physical interaction. In this NVIDIA GTC 2024 session, Vincent Vanhoucke of Google sketches a future in which embodied AI is a natural extension of the revolution that large multimodal models are ushering in, and explores the implications for the future of collaborative robotics and human-centered AI at large.

  • Published: 2024/4/12