
wombatpm · today at 3:19 AM

Couple of observations:

Companies used to hoard talent. Now they hoard compute, RAM, and GPUs.

DeepSeek showed that there may be far less expensive ways to train, meaning the projected eye-watering expenses may never materialize.

Bigger models may not scale. The future may be federations of smaller expert models. Chat GPTX doesn't need to know everything about mental health; it just needs to recognize that the Sigmund von Shrink mental health model should answer some of my questions.
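
A minimal sketch of that federation idea, with all names (the `Expert` class, the keyword routing, the "Sigmund von Shrink" model) hypothetical: a cheap router only needs to recognize the topic and hand the question to a small specialist.

```python
# Sketch of a "federation of experts" router (all names hypothetical).
# A cheap routing step decides which small expert model answers,
# instead of one giant model knowing everything itself.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Expert:
    name: str
    topics: set[str] = field(default_factory=set)  # crude keyword routing, for illustration
    answer: Callable[[str], str] = lambda q: ""

def route(question: str, experts: list[Expert], fallback: Expert) -> str:
    words = set(question.lower().split())
    for expert in experts:
        if words & expert.topics:          # topic match -> delegate to the specialist
            return expert.answer(question)
    return fallback.answer(question)       # otherwise the generalist handles it

# Usage: the general model only recognizes the topic, it never has to master it.
mental_health = Expert("sigmund_von_shrink", {"anxiety", "therapy", "mood"},
                       lambda q: f"[mental-health expert answers: {q}]")
generalist = Expert("generalist", set(), lambda q: f"[general model answers: {q}]")

print(route("How do I manage anxiety before a talk?", [mental_health], generalist))
```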


Replies

johnsmith1840 · today at 6:10 AM

Echoing the other comment: they showed another big thing, which is that the output of an AI model is the AI model. If you mass-prompt-scrape their AI, you can recreate it almost exactly.

Very dangerous, if you think about it, that the product itself is the raw building block for itself.

OpenAI spends $1B on their model and releases it, and instantly it gets scraped by a million bots so some country or company can build their own model.
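
To make the "output is the building block" point concrete, here is an illustrative sketch of harvesting prompt/response pairs from a deployed model into a training set. The `query_model` function is a hypothetical stand-in for a real API call, not any actual client library.

```python
# Sketch: turning another model's outputs into your own training data.
# query_model is a placeholder for a call to someone else's deployed model.

import json

def query_model(prompt: str) -> str:
    # stand-in for a real API call; returns the scraped completion
    return f"response to: {prompt}"

prompts = ["Explain photosynthesis.", "Write a haiku about rain."]

# Each scraped pair becomes one fine-tuning example for a new model.
with open("distill_dataset.jsonl", "w") as f:
    for p in prompts:
        f.write(json.dumps({"prompt": p, "completion": query_model(p)}) + "\n")
```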

chipgap98 · today at 3:30 AM

DeepSeek showed that distillation is possible. Their results wouldn't be possible without someone else doing the leading-edge training first.
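
For reference, a minimal sketch of classic logit-matching distillation (the Hinton-style formulation; API-only distillation instead fine-tunes on sampled text, as in the scraping sketch above). Assumes PyTorch; the models and batch are toy placeholders.

```python
# Sketch of knowledge distillation: train a student to match the
# teacher's softened output distribution instead of learning from scratch.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Soften both distributions with temperature T, then minimize KL divergence.
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    # T*T rescales gradients back to the original magnitude (Hinton et al.).
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T

# Toy usage: a batch of 4 examples over a 10-token vocabulary.
teacher_logits = torch.randn(4, 10)                       # from the big model
student_logits = torch.randn(4, 10, requires_grad=True)   # from the small model
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
```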