The technology already exists now on the algorithmic front for the next 10x drop between everyone adopting DeepSeek's MLA, MoE (mostly already done), Medusa (a better version of Google's speculative decoding), Kimi's Attn Residuals, and Mimo's Sliding Window Attn, and (possibly) Microsoft's 1.58b (this may be a nothing burger).
Historic trends, every 18 months, performance for the same level of quality has gone down 90%.
See: https://www.reddit.com/r/LocalLLaMA/comments/1gpr2p4/llms_co...
And Chart 13 here: https://www.rdworldonline.com/ais-great-compression-20-chart...
And here: https://epoch.ai/data-insights/llm-inference-price-trends
Historically, algorithmic gains are only ~30% of the pie, but there's enough out there to get to 10x, with just what's available already. The other ~70% of the pie is better training data (often synthetic) and distilling frontier knowledge. There's no sign we are tapped out on that front.
Additionally, GRAM (from ~10 days ago) is likely to be a 5-10x on its own (if not substantially more for smaller models). It's unlikely within 4 years LeCun's JEPA ideas and similar ideas like GRAM applied to LLMs have ZERO impact. The preliminary results are absolutely astounding (5000x better reasoning - this is not peanuts).
Further, that's not even counting that cost per watt is still dropping ~2x every 2 years on its own on the hardware front.
If you look at the "cost" of inference. People think it's electricity - but it's currently almost ~80% hardware amortization. The memory shortage is not going to last, nor are Nvidia's ~80-90% margins.
The human brain is still 8-10 orders of magnitude more efficient than the best LLMs of today. With ~1/10th of global capex riding on AI, if you don't think they're going to knock of 2 orders of magnitude more, when it's this obvious and easy... I don't know what to tell you...
Sure, it might take 6 years instead of 4. My crystal ball isn't perfect.
I didn't take you seriously initially but after reading this, i think you are the real deal.
Thank you for sharing this and for having the intellectual courage to hold to a sound reasoning that may be unpopular initially.
This is mostly slop. But you may be directionally correct
Sure, the price will come down a lot, even if we can argue about the timeline.
I think what will also happen, once we get past this current CEO AI FOMO mania, is that companies will start to look at AI spending more rationally like any other company expense, and will revert to more rational decision making.
Even if the cost comes down considerably over the next few years, that's plenty of time for companies to look at their financial results and question why AI expenditure isn't resulting in increase in revenue and/or profitability.