madduci • today at 7:50 AM • 1 reply

The first users of this dataset will be Big Tech corps. Meta, Alphabet, OpenAI, Microsoft, and Apple will all be happy to use this dataset for training their LLMs.

For them, 300TB is cheap.

Replies

ipsum2 • today at 9:16 AM

They already have this data. See Jukebox from OpenAI, released before ChatGPT.
