Pretrained Vision-Language Representations
Unsupervised Hierarchical Object-Centric Scene Representations
Opportunity: Shared representations facilitate flexible teaching, trust, and explanation between robots and humans.
Challenge: What format should such a representation have, and how to best use it?