
bentobean · today at 5:45 AM

> We show that this metric has been consistently exponentially increasing over the past 6 years, with a doubling time of around 7 months.
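As a quick sanity check on the scale of that claim, here is a sketch of the arithmetic the quote implies (assuming "6 years" and "7 months" are taken at face value; the specific metric being doubled is whatever the linked paper measures):

```python
# Rough back-of-envelope: how much growth does a 7-month doubling
# time imply over 6 years of exponential increase?
months = 6 * 12            # 6 years expressed in months
doubling_time = 7          # months per doubling, per the quote
doublings = months / doubling_time
growth = 2 ** doublings    # total multiplicative growth

print(f"~{doublings:.1f} doublings -> ~{growth:,.0f}x increase")
```

So the quoted rates imply roughly a thousand-fold increase in the metric over the period, which is the magnitude the rest of the thread is debating.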

If true, how much of this is a result of:

1. Genuine technical advancement

or:

2. Shoveling trillions of dollars into compute resources in order to service incoming LLM requests in a way that is completely unrealistic over the long term?

In other words… are we talking about genuine, sustainable innovation that we get to take with us and benefit from going forward? Or is this "improvement" more akin to a mirage that disappears when the Ponzi scheme eventually collapses?


Replies

mediaman · today at 6:09 AM

Much of this is due to vastly better post-training RL, not models that are much bigger. The idea that most of these gains come from training really big models, or from throwing immensely larger amounts of compute at the problem, is not really true.

emp17344 · today at 6:03 AM

I wonder how much of this is attributable to true model advancement versus improvements in the agentic harness. It's impossible to separate strict model improvement from improvement in the associated tools.

dghost-dev · today at 5:46 AM

Good point.