Hacker News

mschulkind, yesterday at 6:18 PM

This is an interesting viewpoint, but isn't it also solvable?

Just because the human in the scenario only took vision as input, why does that matter to the training data and the model? The actions are the same.

To put it another way, what about all the cultural context the human had, or the sounds, smells, past experiences at the same intersection, etc.? Even Tesla can't record these, but I'm not sure that matters.


Replies

ai-x, yesterday at 11:06 PM

For example, if the driver brakes because they saw a pothole, and the lidar captures someone biking 200 m away on their own path, the model may mistakenly attribute the braking more to the object 200 m away (because it is a large moving object) than to the pothole.

I'm exaggerating, but I hope you get the point. The problem isn't conflicting sensor signals about the pothole; it's conflicting information about what caused the braking. With vision only, the training data has no such conflict. This was my aha moment. Multiple sensors are absolutely important for fallback and extra safety, but they screw up training that is based on human drivers.
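A rough way to see the pothole and cyclist scenario in numbers is the toy sketch below. Everything in it is made up for illustration: the eight-row log, the feature names, and the `brake_rate_lift` helper, which is only a crude stand-in for how much credit a learner might give a feature. In this log the human brakes only for potholes, yet the lidar-only cyclist feature still picks up some association with braking because it happens to co-occur with it.

```python
# Toy log of human driving. All values are invented for illustration.
# Each row: (pothole_in_camera, cyclist_in_lidar, human_braked)
LOG = [
    (1, 1, 1),
    (1, 1, 1),
    (1, 1, 1),
    (1, 0, 1),
    (0, 0, 0),
    (0, 0, 0),
    (0, 0, 0),
    (0, 1, 0),
]


def brake_rate_lift(log, feature_index):
    """Brake rate when the feature is present minus brake rate when it is absent.

    A hypothetical, very crude proxy for the credit a learner could assign
    to a feature; real training pipelines are far more involved.
    """
    present = [row[-1] for row in log if row[feature_index] == 1]
    absent = [row[-1] for row in log if row[feature_index] == 0]
    return sum(present) / len(present) - sum(absent) / len(absent)


pothole_lift = brake_rate_lift(LOG, 0)  # the actual cause of braking
cyclist_lift = brake_rate_lift(LOG, 1)  # never the cause, only correlated

print("pothole lift:", pothole_lift)
print("cyclist lift:", cyclist_lift)
```

A vision-only learner never receives the cyclist column, so it has nothing to misattribute the braking to; a learner fed both columns sees a nonzero lift for the cyclist even though the driver never reacted to it.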

I think Elon himself doesn't understand this and hence can't articulate it, while just repeating whatever his ML engineer has said.