ACCount37 • today at 2:08 PM • 0 replies • view on HN

They can be base models for a bunch of things. Turning text-conditioned video generation models into robotics VLAs is a fun exercise.

This one is probably too small to be useful for that, and not diverse enough? But I could be wrong.
