Hacker News

jppope · 01/21/2025

Reasonably speaking, there is no way they can know how they plan to invest $500 billion. The current generation of large language models already trains on basically all the human text that's ever been created... not really sure where you go after that with the same tech.


Replies

Philpax · 01/21/2025

That's not really true - the current generation, as in "of the last three months", uses reinforcement learning to synthesize new training data for itself: https://huggingface.co/deepseek-ai/DeepSeek-R1-Zero
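
To make that concrete, here's a toy sketch of the loop (my own illustration, not DeepSeek's actual code; all names are stand-ins): sample many candidate solutions to problems with checkable answers, keep only the ones that pass verification, and reuse those as fresh training data.

    from dataclasses import dataclass
    import random

    @dataclass
    class Problem:
        prompt: str
        answer: int  # verifiable ground truth, e.g. a math result

    def sample_completion(problem: Problem) -> str:
        # Stand-in for an LLM sampled at temperature > 0.
        guess = problem.answer if random.random() < 0.3 else random.randint(0, 9)
        return f"... reasoning ... Final answer: {guess}"

    def verified(problem: Problem, completion: str) -> bool:
        # The reward comes from checking the answer, not from human labels.
        return completion.rstrip().endswith(str(problem.answer))

    def synthesize(problems, samples_per_problem=16):
        kept = []
        for p in problems:
            for _ in range(samples_per_problem):
                c = sample_completion(p)
                if verified(p, c):
                    kept.append({"prompt": p.prompt, "completion": c})
        return kept  # fine-tune / run RL on these in the next round

    print(len(synthesize([Problem("2+2=?", 4), Problem("3*3=?", 9)])))

The key point is that a verifier, not a human, decides what counts as good new data, so the dataset can grow without any new human text.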

rapjr9 · 01/22/2025

The latest hype is around "agents", everyone will have agents to do things for them. The agents will incidentally collect real-time data on everything everyone uses them for. Presto! Tons of new training data. You are the product.
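
A minimal sketch of that side effect (hypothetical names, not any vendor's actual API): an agent wrapper that logs every request/response pair it handles, turning everyday usage into a stream of candidate training examples.

    import json, time

    class LoggingAgent:
        def __init__(self, backend, log_path="interactions.jsonl"):
            self.backend = backend  # any callable: prompt -> reply
            self.log_path = log_path

        def run(self, user_request: str) -> str:
            reply = self.backend(user_request)
            record = {"ts": time.time(), "request": user_request, "reply": reply}
            with open(self.log_path, "a") as f:
                f.write(json.dumps(record) + "\n")  # future training data
            return reply

    agent = LoggingAgent(backend=lambda p: f"(stubbed answer to: {p})")
    agent.run("book me a table for two on Friday")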

cavisne · 01/21/2025

The new scaling vector is “test-time compute”, i.e. spending more compute at inference time.
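
A minimal illustration (my sketch, assuming the simplest strategy, majority voting; there are fancier ones like search over reasoning trees): sample N answers instead of one and vote, trading extra inference FLOPs for accuracy.

    from collections import Counter
    import random

    def sample_answer(prompt: str) -> str:
        # Stand-in for one temperature > 0 LLM call; each sample costs compute.
        return random.choice(["A", "A", "B", "C"])  # noisy model, right half the time

    def majority_vote(prompt: str, n_samples: int) -> str:
        votes = Counter(sample_answer(prompt) for _ in range(n_samples))
        return votes.most_common(1)[0][0]

    for n in (1, 8, 64):  # more samples = more test-time compute
        print(n, majority_vote("Which option is correct?", n))

With 64 samples the vote almost always lands on "A"; with 1 sample it's a coin flip. Same weights, more compute, better answers.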

jazzyjackson · 01/21/2025

It seems to me you could generate a lot of fresh information by running every YouTube video, every hour of TV on archive.org, every movie on The Pirate Bay -- doing scene-by-scene image captioning plus high-quality Whisper transcriptions (not whatever junk auto-transcription YouTube has applied) -- and using that to produce screenplays of everything anyone has ever seen.

I'm not sure why I've never heard of this being done; it would be a good use of GPUs in between training runs.
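
For what it's worth, the pieces exist as off-the-shelf libraries today. A sketch of the pipeline (model choices and the screenplay format are my assumptions), using openai-whisper, PySceneDetect, and a BLIP captioner from transformers:

    import cv2
    import whisper
    from PIL import Image
    from scenedetect import detect, ContentDetector
    from transformers import pipeline

    def screenplay(video_path: str) -> str:
        segments = whisper.load_model("base").transcribe(video_path)["segments"]
        captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
        cap = cv2.VideoCapture(video_path)
        lines = []
        for start, end in detect(video_path, ContentDetector()):  # scene cuts
            t0, t1 = start.get_seconds(), end.get_seconds()
            cap.set(cv2.CAP_PROP_POS_MSEC, (t0 + t1) / 2 * 1000)  # mid-scene frame
            ok, frame = cap.read()
            if not ok:
                continue
            image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            action = captioner(image)[0]["generated_text"]
            dialogue = " ".join(s["text"].strip() for s in segments
                                if t0 <= s["start"] < t1)
            lines.append(f"[{t0:7.1f}s] {action}\n           {dialogue}")
        return "\n".join(lines)

    print(screenplay("episode.mp4"))

Each scene gets a caption as the "action line" and the time-aligned Whisper text as dialogue, which is roughly a screenplay.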

riku_iki · 01/21/2025

I think there is a huge amount of untapped corporate knowledge.