Hacker News

kranner today at 12:10 PM

> If you ask AI to write a document for you, you might get 80% of the deep quality you’d get if you wrote it yourself for 5% of the effort. But, now you’ve also only done 5% of the thinking.

This, but also for code. I just don't trust new code, especially generated code; I need time to sit with it. I can't make the "if it passes all the tests" crowd understand and I don't even want to. There are things you think of to worry about and test for as you spend time with a system. If I'm going to ship it and support it, it will take as long as it will take.


Replies

jdjdjssh today at 1:35 PM

Yep, this is the big sticking point. Reviewing code properly is and was the bottleneck. However, with humans I trusted, I could ignore most of their work and focus on where they knew they needed a review. That kind of trust is worth a lot of money and lets you move really fast.

> I need time to sit with it

Everyone knows doing the work yourself is faster than reviewing somebody else's if you don't trust them. I'd argue that if AI ever gets to the point where you fully trust it, all white-collar jobs are gone.

layer8 today at 1:00 PM

Yes, regression tests are not enough. One generally has to think through code repeatedly, with different aspects in mind, to convince oneself that it is correct under all circumstances. Tests only spot-check; they don't ensure correct behavior under all conceivable scenarios.
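To make that concrete, here's a minimal sketch (the function and its tests are hypothetical, not from the thread) of how point-checks can all pass while an untested region of the input space hides a bug:

    # Hypothetical function with an edge-case bug: the happy-path tests
    # below all pass, but negative inputs were never exercised.
    def clamp_percent(x: int) -> int:
        """Clamp x into the range 0..100."""
        if x > 100:
            return 100
        return x  # bug: negative values pass through unchanged

    # Point-checks: all green, bug undetected.
    assert clamp_percent(50) == 50
    assert clamp_percent(150) == 100

    # Thinking through "all circumstances" surfaces the missing case:
    assert 0 <= clamp_percent(-5) <= 100  # fails: returns -5

Property-based tools like Hypothesis automate some of this probing, but choosing which properties matter is still the human "thinking it through" step.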

slfreference today at 12:41 PM

I think what LLMs do with words is similar to what artists do with software like Cinema 4D.

We have control points (prompts + context) and we ask LLMs to draw a 3D surface that passes through those points while satisfying some given constraints. Subsequent chats are like edit operations.

https://youtu.be/-5S2qs32PII
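A toy version of that analogy in code (a minimal sketch; the curve fitting and the specific points are my own illustration, not from the comment): prompts pin down control points, the model fills in a curve through them, and an edit moves one point and refits everything.

    import numpy as np

    # Control points: stand-ins for "prompts + context".
    x = np.array([0.0, 1.0, 2.0, 3.0])
    y = np.array([0.0, 1.0, 0.5, 2.0])

    # The model's output: a curve interpolating the constraints.
    curve = np.poly1d(np.polyfit(x, y, deg=3))

    # An "edit operation": move one control point and refit;
    # the whole curve shifts to satisfy the new constraint.
    y2 = y.copy()
    y2[2] = 1.5
    curve2 = np.poly1d(np.polyfit(x, y2, deg=3))

    print(curve(1.5), curve2(1.5))  # same input, different surface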

CuriouslyC today at 3:20 PM

You're countering vibes with vibes.

If the tests aren't good enough, break them. Red-team your own software. Exploit your systems. "Sitting with the code" is some Henry David Thoreau bullshit, because it provides exactly zero value to anyone else, whereas red-teamed exploits are objective.
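In the spirit of that suggestion, here's a minimal fuzzing sketch (the target function is a hypothetical stand-in for generated code): instead of sitting with the code, throw adversarial inputs at it until something breaks.

    import random
    import string

    # Hypothetical target: the kind of naive parser generated code produces.
    def parse_kv(line: str) -> tuple[str, str]:
        key, value = line.split("=")  # crashes unless exactly one '=' appears
        return key.strip(), value.strip()

    # Minimal red-team loop: random inputs instead of happy-path tests.
    alphabet = string.ascii_letters + "= \t"
    random.seed(0)
    for _ in range(10_000):
        line = "".join(random.choices(alphabet, k=random.randint(0, 20)))
        try:
            parse_kv(line)
        except ValueError as exc:
            print(f"broke it: {line!r} -> {exc}")
            break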

simianwords today at 1:27 PM

Honest question: why is this not enough?

If the code passes tests and also works at the functional level, what difference does it make whether you've read the code or not?

You could come up with pathological cases: it passed the tests by deleting them, or the code it wrote is extremely messy.

But we know that LLMs are way smarter than this. There's a very low chance of this happening, and even if it does, a quick glance at the code can catch it.
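For what it's worth, the test-deletion case is cheap to guard against mechanically. A minimal CI sketch (the base branch `origin/main` and the `tests/` path are assumptions about the project layout): fail the build whenever a change removes lines from the test suite.

    import subprocess

    # Hypothetical guard for the "passed the tests by deleting them" case:
    # flag any change that deletes lines under tests/.
    diff = subprocess.run(
        ["git", "diff", "--numstat", "origin/main...HEAD", "--", "tests/"],
        capture_output=True, text=True, check=True,
    ).stdout

    # numstat lines are "<added>\t<deleted>\t<path>"; skip binary ("-") entries.
    deleted = sum(
        int(parts[1])
        for line in diff.splitlines()
        if (parts := line.split("\t"))[1].isdigit()
    )
    if deleted > 0:
        raise SystemExit(f"test suite shrank by {deleted} lines; review required")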
