Hacker News

sigmoid10, last Thursday at 10:00 PM

That's not how the economics work. There has been a lot of research showing that deeper nets are more efficient. So if you spend a ton of compute money on a model, you'll want the best output - even though you could just as well build something shallow that may well be state of the art for its depth, but can't keep up with the competition on real tasks.


Replies

llm_trw, last Thursday at 10:07 PM

Which is my point.

You need a ton of specialized knowledge to use compute effectively.

If we had infinite memory and infinite compute we'd just throw every problem of length n at a tensor of size R^(n^n).

The issue is that we don't have enough memory in the world to store that tensor for something as trivial as MNIST (and won't until the 2100s). And as you can imagine, n^n grows a bit faster than a plain exponential, so in practice we never will.
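To give a rough sense of the scale being claimed here, a short sketch (my own back-of-the-envelope, not from the thread) counts the entries of such a tensor for an MNIST-sized input of n = 28*28 = 784 pixels, working in logarithms since the number itself doesn't fit in any machine type:

```python
import math

# Back-of-the-envelope: number of entries in a tensor of size R^(n^n)
# for an MNIST-sized input (n = 28*28 = 784 values).
# We compute log10(n^n) = n * log10(n), since n^n itself is far too
# large to represent directly.
n = 28 * 28  # 784
log10_entries = n * math.log10(n)
print(f"A tensor with n^n entries for n={n} has about 10^{log10_entries:.0f} entries")
# For comparison, the observable universe holds roughly 10^80 atoms,
# so no amount of hardware progress closes this gap.
```

Even one bit per entry would need on the order of 10^2269 bits, which is why the doubly exponential blow-up, not hardware timelines, is the real obstacle.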
