minimaxir • last Sunday at 2:30 AM • 2 replies

I have been testing agentic coding with Claude 4.5 Opus, and the problem is that it's too good at documentation and test cases. It's so thorough that it goes out of scope, so I have to edit its output down to improve the signal-to-noise ratio.

Replies

girvo • last Sunday at 3:38 AM

The “change capture”/straitjacket-style tests LLMs like to output drive me nuts. But humans write those all the time too, so I shouldn’t be that surprised either!

diamond559 • last Sunday at 6:24 AM

If the goal is to document the code and it gets sidetracked and focuses on only certain parts, it failed the test. It just further proves LLMs are incapable of grasping meaning and context.
