Hacker News

noosphr · yesterday at 9:33 PM

This article answers the question in the second paragraph then completely ignores the answer for the rest of it.

>My understanding is that this represents 3-4 “generations” of different technology (propellers, turbojets, etc). Each technology went through normal iterative improvement, then, when it reached its fundamental limits, got replaced by a better technology. The last technology, ramjets, reached its limit at about 3500 km/h, and there wasn’t the economic/regulatory will to develop anything better, so the record stands.

You don't have one sigmoid; you have multiple sigmoids stacked on top of each other. Airplanes aren't just one technology; they are multiple technologies that happen to do the same thing.

Each one follows a sigmoid perfectly. The aggregate only looks exponential(ish) because unpredictable discoveries let you switch to another sigmoid with a higher maximum potential.
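A minimal sketch of that envelope effect (my own toy numbers, not from the comment): each technology generation is a logistic curve that saturates at its own ceiling, but the max over all generations keeps climbing roughly geometrically.

```python
import math

def sigmoid(t, t0, cap):
    """Logistic curve: capability ramps up around time t0 and saturates at cap."""
    return cap / (1 + math.exp(-(t - t0)))

def capability(t):
    """Envelope of several technology generations (illustrative numbers only):
    each later generation has a 10x higher ceiling, so the envelope of the
    individually-saturating curves looks exponential."""
    generations = [(0, 1.0), (10, 10.0), (20, 100.0), (30, 1000.0)]
    return max(sigmoid(t, t0, cap) for t0, cap in generations)

# Each generation flattens out, but the envelope grows ~10x per step:
for t in range(0, 40, 10):
    print(t, round(capability(t), 2))
```

The point of the sketch is that no single curve is exponential; the apparent exponential lives entirely in the hand-offs between curves.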

The same is true in AI. If you tried to train a new frontier model today with the same architecture as GPT-2, you'd be in for a bad time. It's only because of dozens of breakthroughs that model capabilities have improved as much as they have.

That said, exponentials and sigmoids are the wrong models to use for growth. Growth is a differential equation: it has independent inputs, it has outputs, and some of those outputs feed back as inputs again through causal chains of arbitrary complexity. What happens depends entirely on the specific DE that governs the given technology. We can easily have a chaotic system with completely random booms and busts that have no deep fundamental rhyme or reason. We currently call that the economy.
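The feedback point can be illustrated with the simplest textbook case (my choice of example, not the commenter's): the discrete logistic map, a growth model where the output feeds straight back in as the next input. A small growth parameter gives a smooth sigmoid-like plateau; a larger one gives boom-bust swings with no pattern.

```python
def trajectory(r, x0=0.2, steps=20):
    """Iterate the logistic map x -> r*x*(1-x); output feeds back as input."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

smooth = trajectory(r=2.5)   # settles to a fixed point (~0.6): a plateau
chaotic = trajectory(r=3.9)  # never settles: booms and busts, no deeper rhyme

print([round(x, 3) for x in smooth[-3:]])
print([round(x, 3) for x in chaotic[-3:]])
```

Same functional form, one parameter changed, and the qualitative behavior flips from "sigmoid" to "chaos", which is the commenter's point about the governing DE mattering more than the curve family.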


Replies

mediaman · yesterday at 9:57 PM

The book "Origins of Efficiency" by Brian Potter discusses this. Stacked sigmoids are a well-understood idea in innovation.

The idea that exponential growth will continue via stacked sigmoids is also not a given. An example is the nail. Nails used to be about half a percent of US GDP. That's a pretty big number! A series of innovations stacked on each other (each innovation having its own sigmoid) reduced the cost of nails by over 90%.

But eventually nail manufacturing reached a floor. And since the mid-20th century, we haven't gotten much better at making nails. The cost of nails actually started increasing slightly. We ran out of new innovation sigmoids, so we got stuck on the last one.

So what you actually have to predict is whether there will continue to be new sigmoids, not whether the existing sigmoid will asymptote (we already know it will).

This is much more difficult to forecast, because new sigmoids (major new innovations) tend to be unpredictable events. Not only are the particulars difficult to forecast (if they were knowable, the innovation would have already happened), but whether there will be a major innovation at all is also hard to forecast, because new sigmoids are distinct and separate from any existing sigmoid trend.

So we are left with the idea that all current innovations in AI will asymptote in their scaling as they reach the plateau of the sigmoid, but there may be new sigmoids that keep the overall trend up. Or there may not be. We don't know.

That's not very satisfying, so we'll get to keep reading articles like this one.

Sniffnoy · yesterday at 9:36 PM

Yes, I was surprised he never discussed the idea that such exponentials are typically made of stacked sigmoids.

That said... if the exponential is made of stacked sigmoids, it's still an exponential on the whole! The fact that it's made of stacked sigmoids is relevant to the engineers making it, but not so relevant to the users or those otherwise affected by it.

Scene_Cast2 · yesterday at 10:24 PM

Something that deeply frustrates me, as someone who did R&D on model architectures, is how similar modern LLM architectures are to GPT-2.

(This is a bit disingenuous, as most of the work is spent on the scaling and training side of things.)
