Hacker News

godelski · today at 1:45 AM

  > They're still in the process of researching it
I should have taken more care to link an article, but I was trying to link you to something clearer.

But mind you, everything Waymo does is under research.

So let's look at something newer to see if it's been incorporated:

  > We will unpack our holistic AI approach, centered around the Waymo Foundation Model, which powers a unified demonstrably safe AI ecosystem that, in turn, drives accelerated, continuous learning and improvement.

  > Driving VLM for complex semantic reasoning. This component of our foundation model uses rich camera data and is fine-tuned on Waymo’s driving data and tasks. Trained using Gemini, it leverages Gemini’s extensive world knowledge to better understand rare, novel, and complex semantic scenarios on the road.

  > Both encoders feed into Waymo’s World Decoder, which uses these inputs to predict other road users behaviors, produce high-definition maps, generate trajectories for the vehicle, and signals for trajectory validation. 
 
They also go on to explain model distillation. Read the whole thing; it's not long.
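To make the quoted pipeline concrete, here is a minimal sketch of the dataflow the blog post describes: camera data through a driving VLM, other sensor data through a second encoder, both feeding a world decoder that emits behavior predictions, maps, trajectories, and validation signals. All names and signatures here are hypothetical illustrations, not Waymo's actual API.

```python
# Hypothetical sketch of the dataflow described in the quoted post.
# Every class and function name is illustrative, not Waymo's real code.

from dataclasses import dataclass

@dataclass
class SensorInput:
    camera: list  # rich camera frames (placeholder)
    lidar: list   # non-camera sensor modalities (placeholder)

def driving_vlm(camera):
    """Semantic reasoning over camera data (per the post, fine-tuned on
    Waymo driving data and trained using Gemini). Returns a semantic signal."""
    return {"semantics": camera}

def sensor_encoder(lidar):
    """Second encoder over the remaining sensor data (assumed)."""
    return {"geometry": lidar}

def world_decoder(semantic, geometric):
    """Per the quote: predicts other road users' behaviors, produces
    HD maps, generates trajectories, and emits validation signals."""
    return {
        "predicted_behaviors": [],
        "hd_map": {},
        "candidate_trajectory": [],
        "validation_signals": [],
    }

def plan(inputs: SensorInput):
    sem = driving_vlm(inputs.camera)
    geo = sensor_encoder(inputs.lidar)
    out = world_decoder(sem, geo)
    # Note the VLM only contributes a semantic signal upstream; the
    # trajectory is still checked against validation signals downstream
    # rather than executed blindly.
    return out
```

The point of the sketch is structural: the VLM is one encoder among several, not an end-to-end controller, which is what the reply below turns on.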

https://waymo.com/blog/2025/12/demonstrably-safe-ai-for-auto...

But you could also read the actual research paper... or any of their papers. All of them in the last year have focused on multimodality and a generalist model, for a reason that I think is not hard to figure out, since they spell it out.


Replies

theamk · today at 5:31 AM

Note this is not end-to-end... All the VLM can do is "contribute a semantic signal".

Put up a fake "detour" sign so the vehicle thinks there's a detour and starts to follow it? Possible. But humans can be fooled like this too.

Put a "proceed" sign so the car runs over the pedestrian, like that article proposes? Get car to hit a wall? Not going to happen.