One thing I notice is that they specify that the robot has never seen the homes before, but certain objects, like the laundry baskets, are identical.
Doing your demo is significantly easier if you've already programmed/trained the robot to recognize the specific objects it has to interact with, even if those items are in different locations.
Isn't object recognition essentially solved? AI models were beating humans at image classification (in terms of error rate) back in 2016. Even if this particular model isn't the best at it, they can always call out to an API or have a secondary on-device VLM that has stronger object recognition capabilities.
They also got these things working in corners of a location instead of stacking tasks on different areas of the same location. And even on these "one-area" task groups it can fail a good amount. Kudos to them for showing the failures, though.