Why do people keep talking about AI as if it actually worked? I still don't see ANY proof that it ...

lordkrandel • today at 10:40 AM • 12 replies

Why do people keep talking about AI as if it actually worked? I still don't see ANY proof that it doesn't generate a totally unmaintainable, insecure mess that, since you didn't develop it, you don't know how to fix. Like running an F1 Ferrari on a countryside road: useless and dangerous.

Replies

tyleo • today at 12:11 PM

Because it's working for a lot of people. There are people getting value from these products right now. I'm getting value myself and I know several other folks at work who are getting value.

I'm not sure what your circumstances are but even if it's not true for you, it's true for many other people.

giantg2 • today at 12:49 PM

I see more value on the business side than the tech side. Ask the AI to transcribe images, write an email, parse some Excel data, create a prototype, etc. For some of these you might otherwise have hired a tech resource to write a script.

On the tech side I see it saving some time with stuff like mock data creation, writing boilerplate, etc. You still have to review it like it's a junior. You still have to think about the requirements and design so you can give them (AI or junior) a detailed understanding.

I don't think either of these will provide 90% productivity gains. Maybe 25-50% depending on the job.

MrScruff • today at 7:10 PM

I see a lot of people talk about 'insecure code' and while I don't doubt that's true, there's a lot of software development where security isn't actually a concern because there's no need for the software to be 'secure'. Maintainability is important, I'll grant you.

AStrangeMorrow • today at 2:52 PM

For me, the main thing is to never have it write anything based on the goal (what the end result should look like and how it should behave), only on the implementation details (and the coding practices that I like).

Sure, it is not as fast to understand as code I wrote myself. But at least I mostly just need to confirm that it implemented things the way I asked, rather than figure out WHAT it even decided to implement in the first place.

And in my org, people move around projects quite a bit. It hasn’t been uncommon for me to jump into projects with 50k+ lines of code a few times a year to help implement a tricky feature, or help optimize things when they run too slow. Lots of code to understand then. Depending on who wrote it, sometimes it is simple: one or two files to understand, clean code. Sometimes it is an interconnected mess and, imho, often way less organized than AI-generated code.

And same thing for the review process: lots of having to understand new code. At least with AI you are fed the changes at a slower pace.

alexjplant • today at 2:24 PM

> Why do people keep talking about AI as if it actually worked?

Because it does.

> I still don't see ANY proof that it doesn't generate a totally unmaintainable, insecure mess that, since you didn't develop it, you don't know how to fix.

I wouldn't know since it's been years since I've tried but I'd imagine that Claude Code would indeed generate a half-baked Next.js monstrosity if one-shot and left to its own devices. Being the learned software engineer I am, however, I provide it plenty of context about architecture and conventions in a bootstrapped codebase and it (mostly) obeys them. It still makes mistakes frequently but it's not an exaggeration to say that I can give it a list of fields with validation rules and query patterns and it'll build me CRUD pages in a fraction of the time it'd take me to do so.

I can also give it a list of sundry small improvements to make and it'll do the same, e.g. I can iterate on domain stuff while it fixes a bunch of tiny UX bugs. It's great.

UqWBcuFx6NV4r • today at 2:17 PM

Person that can’t use a hammer “hasn’t seen any proof” that hammers work.

thefounder • today at 12:30 PM

You can launch a new product in one month instead of 12 months. I think this works best for startups, where the risk tolerance is high, but works less well for companies such as Amazon, where system failure has high costs.

lukev • today at 1:13 PM

I agree, but absence of evidence is not evidence of absence, and we currently have a lot of developers who feel very productive right now.

We are very much in need of an actual way to measure real economic impact of AI-assisted coding, over both shorter and longer time horizons.

There's been an absolute rash of vibecoded startups. Are we seeing better success rates or sales across the industry?

fbrncci • today at 1:30 PM

It works. You’re just not doing it right if it doesn’t work for you. It’s hard to convince me otherwise at this point.

Kim_Bruning • today at 12:18 PM

Consider it this way as a reasoning step: We've invented a cross compiler that can handle the natural languages too. That's definitely useful; but it's still GIGO so you still need your brain.

K0balt • today at 2:28 PM

I’ve been using it to develop firmware in C++. Typically around 10-20 KLOC. Current projects use sensors, wire protocols, RF systems, swarm networks, that kind of stuff integrated into the firmware.

If you use it correctly, you can get better quality, more maintainable code than 75% of devs will turn in on a PR. The “one weird trick” seems to be to specify, specify, specify. First you use the LLM to help you write a spec (or document it, if it’s pre-existing). Make sure the spec is correct and matches the user story and edge cases. The LLM is good at helping here too. Then break down separations of concerns, APIs, and interfaces. Have it build a dependency graph. After each step, have it reevaluate the entire stack to make sure it is clear, clean, and self-consistent.

Every step of this is basically the AI doing the whole thing, just with guidance and feedback.

Once you’ve got the documentation needed to build an actual plan for implementation, have it do that. Each step, you go back as far as relevant to reevaluate. Compare the spec to the implementation plan, close the circle. Then have it write the bones, all the files and interfaces, without actual implementations. Then have it reevaluate the dependency graph and the plan and the file structure together. Then start implementing the plan, building testing jigs along the way.

You just build software the way you used to, but you use the LLM to do most of the work along the way. Every so often, you’ll run into something that doesn’t pass the smell test and you’ll give it a nudge in the right direction.

Think of it as a junior dev that graduated top of every class ever, and types 1000wpm.

Even after all of that, I’m turning out better code, better documentation, and better products, and doing what used to take 2 devs a month, in 3 or 4 days on my own.

On the app development side of our business, the productivity gain is also strong. I can’t really speak to code quality there, but I can say we get updates in hours instead of days, and there are fewer bugs in the implementations. They say the code is better documented and easier to follow, because they’re not under pressure to ship hacky prototype code as if it were production.

On the current project, our team size is 1/2 the size it would have been last year, and we are moving about 4x as fast. What doesn’t seem to scale for us is size. If we doubled our team size I think the gains would be very small compared to the costs. Velocity seems to be throttled more by external factors.

I really don’t understand where people are coming from saying it doesn’t work. I’m not sure if it’s because they haven’t tried a real workflow, or maybe tried it at all, or they are definitely “holding it wrong.” It works. But you still need seasoned engineers to manage it and catch the occasional bad judgment or deviation from the intention.

If you just let it, it will definitely go off the rails and you’ll end up with a twisted mess that no one can debug. But if you write the code incrementally through a specification-evaluation loop as you descend the abstraction levels from idea to implementation, you’ll end up winning.

As a side note, and this is a little strange and I might be wrong because it’s hard to quantify and all vibes, but:

I have the AI keep a journal about its observations and general impressions, sort of the “meta” without the technical details. I frame this to it as a continuation of “awareness” for new sessions.

I have a short set of “onboarding” documents that describe the vision, ethos, and goals of the project. I have it read the journal and the onboarding docs at the beginning of each session.

I frame my work with the AI as working with it as a “collaborator” rather than a tool. At the end of the day, I remind it to update its journal of reflections about the day’s work. It’s total anthropomorphism, obviously, but it seems to inspire “trust” in the relationship, and it really seems to up-level the effort that the AI puts in. It kinda makes sense, LLMs being modelled on human activity.

FWIW, I’m not asserting anything here about the nature of machine intelligence, I’m targeting what seems to create the best result. Eventually we will have to grapple with this I imagine, but that’s not today.

When I have forgotten to warm-start the session, I find that I am rejecting much more of the work. I think this would be worth someone doing an actual study to see if it is real or some kind of irresistible cognitive bias.

I find that the work produced is much less prone to going off the rails or taking shortcuts when I have this in the context, and by reading the journal I get ideas on where and how to do a better job of steering and nudging to get better results. It’s like a review system for my prompting. The onboarding docs seem to help keep the model working towards the big picture? Idk.

This “system” with the journal and onboarding only seems to work with some models. GPT5 for example doesn’t seem to benefit from the journal and sometimes gets into a very creepy vibe. I think it might be optimized for creating some kind of “relationship” with the user.

the_real_cher • today at 12:02 PM

Yeah, it's great, but as the OP said, you have to watch every PR like a hawk.
