The pattern we keep converging on is to treat model calls like a budgeted distributed system, not li...

wuweiaxin • today at 6:05 PM • 0 replies • view on HN

The pattern we keep converging on is to treat model calls like a budgeted distributed system, not like a magical API. The expensive failures usually come from retries, fan-out, and verbose context growth rather than from a single bad prompt. Once we started logging token use per task step and putting hard ceilings on planner depth, costs became much more predictable.

alt Hacker News