Hacker News

Terretta, last Monday at 7:55 PM

You think someone is special-casing things like estimates, or even should be? What else deserves that level of intervention so they look less dumb?

Logistics for getting to the car wash next door?

In the meantime, alas, no: we can see from actual prompts, sent directly or through sub-agents, and from actual replies, that estimates remain LLM-generated.

Though this discussion here could change that, because there is indeed a lot of special casing and context stuffing going on, one of the oldest examples being today's date.
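For anyone who hasn't seen it, the "today's date" kind of context stuffing can be as small as the sketch below. This is only an illustration: `build_system_prompt` and the prompt wording are made up for this example and are not taken from any real product.

```python
from datetime import date
from typing import Optional


def build_system_prompt(base_prompt: str, today: Optional[date] = None) -> str:
    """Prepend the current date so the model does not have to guess it.

    `build_system_prompt` is a hypothetical helper, not a real API.
    """
    if today is None:
        today = date.today()
    # ISO format (YYYY-MM-DD) avoids day/month ambiguity.
    return f"Today's date is {today.isoformat()}.\n\n{base_prompt}"


if __name__ == "__main__":
    print(build_system_prompt("You are a helpful coding assistant."))
```

The model never "knows" the date; the harness simply writes it into the context on every request.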

• • •

I did read the Claude Code leak, and use pi, etc. So I disagree with your premise rather strongly. Today's "systems" remain, roughly, piles of markdown and context engineering wrapped in UI affordances, and behave very similarly today to how they did in 2024 for those already engineering context and delegating.


Replies

ghshephard, last Monday at 10:22 PM

I do a lot of code bisecting with Claude Code - it spends hours running experiments, looking at the results, and guessing what to try next - until it eventually comes around to a working code pattern. Maybe this is as much a reflection on me as anything else, but its pattern of logic isn't that much different from what I would do. It knows, in general, what tools and APIs it can call; it tries something, observes the result, and then comes back and tries different experiments based on success or failure, mostly bisecting efficiently to a solution.
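In its tidiest form, that try/observe/narrow loop is classic bisection. The sketch below assumes an ordered list of candidates that starts good and ends bad, plus a hypothetical `is_good` check standing in for "run the experiment"; what the agent actually does is looser than this.

```python
from typing import Callable, Sequence


def bisect_first_bad(candidates: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the first candidate that fails `is_good`.

    Assumes candidates[0] is good, candidates[-1] is bad, and that once a
    candidate is bad every later one is bad too. `is_good` is a hypothetical
    stand-in for running one experiment and observing the result.
    """
    lo, hi = 0, len(candidates) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(candidates[mid]):
            lo = mid  # failure was introduced after mid
        else:
            hi = mid  # failure was introduced at or before mid
    return candidates[hi]


if __name__ == "__main__":
    commits = [f"c{i}" for i in range(8)]
    # Pretend everything from c5 onward is broken.
    print(bisect_first_bad(commits, lambda c: int(c[1:]) < 5))
```

Each experiment halves the remaining range, so eight candidates need three runs rather than seven.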

I'm still lower down on the capability scale, as I'm still manually directing agents to do these wiggins loops - obviously the next step up is to direct the code loops that control the agents. I just haven't got my tooling nailed down to the point where I find that more productive.

I actually might agree with you that this is mostly just "next token prediction" - if I can concede that's really all I do as well.

8note, last Monday at 11:09 PM

rather than special casing, build real data from chat logs on how long things took, in both calendar time and chat time
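a rough sketch of what that could look like, given message timestamps for one task. the definitions are my assumptions: "calendar time" is first message to last message, "chat time" is the sum of gaps between consecutive messages that are shorter than an idle cutoff. `task_durations` and the 30-minute cutoff are made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def task_durations(
    timestamps: Iterable[datetime],
    idle_cutoff: timedelta = timedelta(minutes=30),
) -> Tuple[timedelta, timedelta]:
    """Return (calendar_time, chat_time) for one task's messages.

    calendar_time: last timestamp minus first timestamp.
    chat_time: sum of gaps between consecutive messages that are no longer
    than `idle_cutoff` (assumed definition; longer gaps count as breaks).
    """
    ts = sorted(timestamps)
    if not ts:
        return timedelta(), timedelta()
    calendar_time = ts[-1] - ts[0]
    chat_time = sum(
        (b - a for a, b in zip(ts, ts[1:]) if b - a <= idle_cutoff),
        timedelta(),
    )
    return calendar_time, chat_time


if __name__ == "__main__":
    log = [
        datetime(2025, 1, 6, 9, 0),
        datetime(2025, 1, 6, 9, 10),
        datetime(2025, 1, 6, 9, 25),
        datetime(2025, 1, 7, 9, 0),   # picked back up the next morning
        datetime(2025, 1, 7, 9, 5),
    ]
    print(task_durations(log))
```

aggregating those pairs per kind of task would give estimates grounded in logs instead of generated ones.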