Hacker News

postalcoder today at 8:29 AM

From the paper:

> Datasets. We construct a diverse and high-quality collection of video datasets to train STARFlow-V. Specifically, we leverage the high-quality subset of Panda (Chen et al., 2024b) mixed with an in-house stock video dataset, with a total number of 70M text-video pairs.


Replies

justinclift today at 10:22 AM

> in-house stock video dataset

Wonder if "iCloud backups" would be counted as "stock video" there? ;)
