Hacker News

ACCount37 today at 5:08 PM

It's the difference between raw LLM output vs LLM output that was tweaked, reviewed and validated by a competent developer.

Both can look like the same exact type of AI-generated code. But one is a broken useless piece of shit and the other actually does what it claims to do.

The problem is just how hard it is to differentiate the two at a glance.


Replies

oceanplexian today at 6:28 PM

> It's the difference between raw LLM output vs LLM output that was tweaked, reviewed and validated by a competent developer.

This is one of those areas where you might have been right... 4-6 months ago. But if you're paying attention, the floor has moved up substantially.

For the work I do, last year the models would occasionally produce code with bugs, linter errors, etc.; now the frontier models produce mostly flawless code that I don't need to review. I'll still write tests, or prompt test scenarios for it, but most of the testing is functional.

If the exponential curve continues, I think everyone needs to prepare for a step-function change. Debian may even cease to be relevant because AI will write something better in a couple of hours.
