I definitely agree that in principle a computer can drive with cameras alone. I don't know whether it's a useful statement, though. By the same logic, a human can determine the genre of a movie merely by watching it, but I wouldn't have suggested to Blockbuster in 1990 that they collect no genre metadata for movies because the database server should sort it out on its own. (Nowadays that's somewhat feasible with ML, of course, but that's 20+ years later.) What sensors/data you need is a question of where computers are now or will shortly be, and it seems that for now they need the extra structure of LIDAR for best effectiveness.
>I definitely agree that in principle a computer can drive with cameras alone.
Obvious things first: cameras have way worse contrast and low-light sensitivity than human eyes.
Humans have a much more evolved capacity for logical thinking; even the stupid ones can figure out stuff that modern AI struggles with.
Humans have other sensors, too, which they use to plausibility-check the picture they see, i.e. they are one of the best sensor fusion systems on the planet.
When in doubt, humans can figure out whether it's a lens occlusion or some other artifact in their vision by moving their head around.
There are probably other things I'm not thinking of. In any case, to make full self-driving work we should start by using all available tech to make it safe. Once you have safe tech, you can slowly start removing individual sensors while verifying that safety remains high. As experience and the system evolve, there will be optimization potential.
And until we have low-light and high-contrast performance figured out, cameras alone don't cut it.