Hacker News

prmph, last Tuesday at 9:08 PM

So why can't the deterministic part of the agent program embed in all these checks?


Replies

colechristensen, last Tuesday at 11:31 PM

It absolutely can, and I'm building things to do this for me. Claude Code has hooks that are supposed to trigger on certain states, but so far they don't trigger reliably enough to be useful. What we need are the primitives to build code-based development cycles where each step is executed by a model but the flow is dictated by code. Everything today relies too heavily on prompt engineering, and with long context windows instruction following gets lax. I ask my model "What did you do wrong?" and it comes back with "I didn't follow instructions", then gives clear, detailed, and correct reasons for how it failed to follow them... but that's not especially helpful, because it still doesn't follow instructions afterwards.
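To make "each step is executed by a model but the flow is dictated by code" concrete, here is a minimal sketch of what such a primitive might look like. Everything in it is hypothetical: `Step`, `run_workflow`, and `fake_model` are illustrative names, not part of Claude Code or any real library, and the model call is a stub so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns model text.
ModelFn = Callable[[str], str]


@dataclass
class Step:
    name: str
    prompt: str
    check: Callable[[str], bool]  # deterministic validation, written in code


def run_workflow(steps: list[Step], model: ModelFn, max_retries: int = 2) -> dict[str, str]:
    """Run steps in a fixed order; code, not the model, decides what happens next."""
    results: dict[str, str] = {}
    for step in steps:
        for _attempt in range(max_retries + 1):
            output = model(step.prompt)
            if step.check(output):
                results[step.name] = output
                break  # check passed, move on to the next step
        else:
            # Every attempt failed the check: stop instead of trusting the model.
            raise RuntimeError(
                f"step {step.name!r} failed its check after {max_retries + 1} attempts"
            )
    return results


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call (assumption: swap in your own client)."""
    if "plan" in prompt:
        return "TODO: write tests"
    return "def add(a, b):\n    return a + b"


steps = [
    Step("plan", "Write a plan", lambda out: out.startswith("TODO")),
    Step("code", "Write the code", lambda out: "def " in out),
]
results = run_workflow(steps, fake_model)
```

The point of the design is that the checks and the retry limit live outside the prompt, so a model that drifts from its instructions fails a step loudly instead of silently skipping it.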

vidarh, last Wednesday at 10:02 AM

It increasingly is. E.g. if you use Claude Code, you'll notice it "likes" to produce todo lists that are rendered specially via the built-in TodoWrite tool.

But it's also a balance: tools that need to support very different workflows have to avoid being over-prescriptive, and it's easy to add more specific checks via plugins.

We're bound to see more packaged up workflows over time, but the tooling here is still in very early stages.