prettygood • today at 10:08 AM • 6 replies

I'm so happy someone else says this, because I'm doing exactly the same. I tried to use agent mode in VS Code and the output was still bad. You read simple claims like "We use it to write tests." I gave it a very simple repository, said to write tests, and the result wasn't usable at all. Really wonder if I'm doing it wrong.

Replies

kace91 • today at 10:42 AM

I’m not particularly pro-AI, but I struggle with the mentality some engineers seem to apply to trying these tools.

If someone said “I don’t see what the big deal is with vim. I ran it, pressed some keys, and it didn’t write any text at all”, they’d be mocked for it.

But with these tools there seems to be an attitude of “if I don’t get results straight away it’s bad”. Why the difference?

embedding-shape • today at 10:10 AM

You didn't actually just say "write tests" though, right? What was the actual prompt you used?

I feel like that matters more than the tooling at this point.

I can't really understand letting LLMs decide what to test; they seem to completely miss the boat when it comes to testing. Half of the tests are useless because they duplicate what they test, and the other half don't test what they should be testing. There are so many shortcuts, and LLMs require A LOT of hand-holding when writing tests, more so than for other code, I'd wager.
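To illustrate what "duplicate what they test" can look like, here is a small sketch. The function `apply_discount` and both tests are made up for this example and are not from anyone's repository. The first test recomputes the expected value with the same formula as the code, so it would still pass if that formula were wrong; the second pins values worked out by hand.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test: price after a percentage discount."""
    return round(price * (1 - percent / 100), 2)


def test_duplicates_implementation():
    # Weak: the expected value is derived with the same formula as the code,
    # so a mistake in the formula would be copied into the test as well.
    price, percent = 80.0, 25.0
    assert apply_discount(price, percent) == round(price * (1 - percent / 100), 2)


def test_pins_known_values():
    # Stronger: expected values are worked out independently of the code.
    assert apply_discount(80.0, 25.0) == 60.0
    assert apply_discount(19.99, 0.0) == 19.99
    assert apply_discount(50.0, 100.0) == 0.0
```

Both tests pass against a correct implementation; only the second one would catch a wrong formula.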

threecheese • today at 1:39 PM

“Write tests” may not be enough; provide it with a test harness, and instruct it to “write tests until they pass”. Next would be “your feature isn’t complete without N% coverage”. These require the ‘agentic’ piece, which is at its simplest some prompts run in a loop until an exit condition is met.
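To make the "prompts run in a loop until an exit condition is met" part concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `run_until_done`, `fake_model`, and `fake_check` are hypothetical names, and the toy stand-ins replace a real model call and a real test harness. It is not how any particular agent tool is implemented.

```python
from typing import Callable, Tuple


def run_until_done(
    ask_model: Callable[[str], str],
    check: Callable[[str], Tuple[bool, str]],
    task: str,
    max_rounds: int = 5,
) -> Tuple[bool, int]:
    """Prompt in a loop until `check` passes or the round budget is spent.

    Returns (succeeded, rounds_used).
    """
    prompt = task
    for round_no in range(1, max_rounds + 1):
        attempt = ask_model(prompt)
        passed, report = check(attempt)
        if passed:  # exit condition met, e.g. suite is green or coverage >= N%
            return True, round_no
        # Feed the failure report back so the next round can react to it.
        prompt = f"{task}\n\nPrevious attempt failed:\n{report}"
    return False, max_rounds


# Toy stand-ins (hypothetical) so the sketch runs without a model or harness.
def fake_model(prompt: str) -> str:
    return "fixed" if "failed" in prompt else "broken"


def fake_check(attempt: str) -> Tuple[bool, str]:
    return (attempt == "fixed", "assertion error in test_example")


print(run_until_done(fake_model, fake_check, "Write tests for utils.py"))
```

In a real setup, `check` would run the test harness (and optionally a coverage threshold) and return its output as the report. The `max_rounds` cap matters because the exit condition may never be met.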

tasuki • today at 1:05 PM

> I gave it a very simple repository, said to write tests, and the result wasn't usable at all. Really wonder if I'm doing it wrong.

I think so. The humans should be writing the spec. The AI can then (try to) make the tests pass.

sixtyj • today at 11:07 AM

No, your experience is similar to what a lot of people have.

LLMs just fail (hallucinate) in lesser-known fields of expertise.

Funny: today I asked Claude for the syntax to run Claude Code, and its answer was totally wrong :) So you go to the documentation… and parts of it are obsolete as well.

LLM development follows the “move fast and break things” style.

So in a few years there will be so many repos with gibberish code because “everybody is a coder now”, even basketball players or taxi drivers (no offense, ofc, just an example).

It is like giving an F1 car to me :)

agumonkey • today at 10:30 AM

you need to write a test suite to check its test generation (soft /s)
