swiftcoder • today at 2:25 PM

As I understand it, they mean both computer vision and video gen, linked by a pretty robust world model. One of their hosted examples purely analyses an existing video; the other predicts a video from a static image (i.e. video gen).
