Hacker News

redhale · last Wednesday at 12:35 PM

Not necessarily responding to you directly, but I find this take to be interesting, and I see it every time an article like this makes the rounds.

Starting back in 2022/2023:

- (~2022) It can auto-complete one line, but it can't write a full function.

- (~2023) Ok, it can write a full function, but it can't write a full feature.

- (~2024) Ok, it can write a full feature, but it can't write a simple application.

- (~2025) Ok, it can write a simple application, but it can't create a full application that is actually a valuable product.

- (~2025+) Ok, it can write a full application that is actually a valuable product, but it can't create a long-lived complex codebase for a product that is extensible and scalable over the long term.

It's pretty clear to me where this is going. The only question is how long it takes to get there.


Replies

arkensaw · last Wednesday at 1:15 PM

> It's pretty clear to me where this is going. The only question is how long it takes to get there.

I don't think it's a guarantee. All of the things on that list are greenfield; they just have increasing complexity. The problem is that even in agentic mode, these models do not (and, I would argue, cannot) understand code or how it works; they just see patterns and generate a plausible-sounding explanation or solution. Agentic mode means they can try/fail/try/fail/try/fail until something works, but without understanding the code, especially in a large, complex, long-lived codebase, they can unwittingly break something without realising it - just like an intern or newbie on the project, which is the most common analogy for LLMs, with good reason.

bayindirh · last Wednesday at 12:44 PM

Well, the first 90% is easy, the hard part is the second 90%.

Case in point: self-driving cars.

Also, consider that we need to pirate the whole internet to be able to do this, so these models are not creative. They are just directed blenders.

PunchyHamster · last Wednesday at 1:36 PM

Note that blog posts rarely show the 20 other times it failed to build something, only the one time it happened to work.

We've seen the same progression with self-driving cars, and they have also been stuck on the last 10% for the last 5 years.

Scea91 · last Wednesday at 2:02 PM

> - (~2023) Ok, it can write a full function, but it can't write a full feature.

The trend is definitely here, but even today it heavily depends on the feature.

While extremely useful, it still requires intense iteration and human insight for >90% of our backlog. We develop a cybersecurity product.

sanderjd · last Wednesday at 1:05 PM

Yeah, maybe, but at the moment it feels to me more like a plateau than an exponential takeoff.

And this isn't a pessimistic take! I love this period of time where the models themselves are unbelievably useful, and people are also focusing on the user experience of using those amazing models to do useful things. It's an exciting time!

But I'm still pretty skeptical of "these things are about to not require human operators in the loop at all!".

EthanHeilman · last Wednesday at 3:38 PM

I haven't seen an AI successfully write a full feature for an existing codebase without substantial help; I don't think we are there yet.

> The only question is how long it takes to get there.

This is the question, and I would temper expectations with the fact that we are likely to hit diminishing returns from real gains in intelligence as task difficulty increases. Real-world tasks probably fit into a complexity hierarchy similar to computational complexity. One of the reasons the AI predictions made in the 1950s for the 1960s did not come to be is that we assumed problem difficulty scaled linearly: double the computing speed and you get twice as good at chess, or twice as good at planning an economy. The separation of P and NP flattened these predictions. It is likely that current predictions will run into similar separations.

It is probably the case that if you made a human 10x as smart, they would only be 1.25x more productive at software engineering. The reason we have 10x engineers is less about raw intelligence; they are not 10x more intelligent, but rather have more knowledge and wisdom.

kubb · last Wednesday at 1:45 PM

Each of these years we’ve had a claim that it’s about to replace all engineers.

By your logic, does that mean engineers will never get replaced?

HarHarVeryFunny · last Wednesday at 2:16 PM

Sure, eventually we'll have AGI, then no worries, but in the meantime you can only use the tools that exist today, and dreaming about what should be available in the future doesn't help.

I suspect that the timeline from autocomplete-one-line to autocomplete-one-app, which was basically a matter of scaling and RL, may in retrospect turn out to have been a lot faster than the next LLM-to-AGI step, where it becomes capable of using human-level judgement and reasoning, etc., to become a developer, not just a coding tool.

ugurs · last Wednesday at 3:37 PM

Ok, it can create a long-lived complex codebase for a product that is extensible and scalable over the long term, but it doesn't have cool tattoos and can't fancy a matcha.

mjr00 · last Wednesday at 2:00 PM

This is disingenuous because LLMs were already writing full, simple applications in 2023.[0]

They're definitely better now, but it's not like ChatGPT 3.5 couldn't write a full simple todo list app in 2023. There were a billion blog posts talking about that and how it meant the death of the software industry.

Plus, I'd actually argue that more of the improvement has come from the tooling around the models than from the models themselves.

[0] e.g. https://www.youtube.com/watch?v=GizsSo-EevA

itsthecourier · last Wednesday at 12:44 PM

I use it on a 10-year-old codebase. I need to explain where to get context, but it works successfully 90% of the time.