There were some balloons coincidentally in the background of a colleague's camera view. The Carter volunteered "and can I just say, we need more positivity in the world, the balloons behind you give a good vibe." My colleague physically recoiled, pushed the camera away, and hung up.
I think it was a combination of the intrusiveness and the notion of a machine 1) projecting (incorrect) assumptions about her attitudes/intentions onto the environment's decor, and 2) passing judgment on her. That kind of comment would be impolite between strangers, like the thing that only a bad boss would feel entitled to say to an underling they didn't know very well.
Just an implementation detail, though, of course! I figure if you're able to evoke massive spookiness and subtle shades of social expectations like this, you must be onto something powerful.
I’d wager my nonexistent tech GTM credentials that they specifically encourage the demo model to do this to highlight the multimodal input for the wow factor.
At this point in the hype cycle, being memorable probably outweighs being creepy!
I think it's just not a super smart model. They had to make a slight compromise to keep the latency low. The naturalness of the conversation that they did achieve is a great technical accomplishment under these kinds of constraints, though.
For me, it said "are you comfortable sharing what that mark is on your forehead?" Or something like that. I said basically "I don't know, maybe a wrinkle?" Lol. Kind of confirms for me why I should continue to avoid video chats. I did look like crap in general, really tired for one thing. And I am 46, so I have some wrinkles, although I didn't know they were that obvious.
But a little bit of prompt guidance to avoid commenting on the visuals unless relevant would help. It's possible they actually deliberately put something in the prompt to ask it to make a comment just to demonstrate that it can see, since this is an important feature that might not be obvious otherwise.
On the other hand it was able to talk about my background and that made it feel far more like a regular video call to me. Trying to forbid this stuff then leads to stilted conversations where they're explaining they're not allowed to talk about your surroundings.