Hacker News

SlinkyOnStairs · today at 6:02 PM · 4 replies

A fundamental architectural problem is that they genuinely do not know what a query will cost ahead of time.

Even for a single standalone LLM that's the case, and the 'agentic' layers thrown on top just make that problem exponentially worse.

You'd need to switch away from LLMs entirely to fix this problem.


Replies

babyshake · today at 6:03 PM

Isn't this an orthogonal issue that doesn't affect whether billing is done with credits or money?

zozbot234 · today at 8:06 PM

If the expensive part of a query runs iteratively (especially if agentic), you can cut those loops short to bound the cost. Even if it's pure forward generation, you could pause an expensive inference and continue it seamlessly on a cheaper model, adding little to the total cost.
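A minimal sketch of that idea, with made-up model names, prices, and a stand-in `call_model` function (none of this is from a real API): cap the iteration count, and downgrade to a cheaper model before a call would overrun the budget.

```python
# Hypothetical per-call prices; real costs depend on tokens, not calls.
COST_PER_CALL = {"expensive-model": 0.05, "cheap-model": 0.005}

def call_model(model, prompt):
    # Stand-in for a real inference call; returns (answer, done_flag).
    return f"{model} answer to: {prompt}", True

def run_agent(prompt, budget=0.10, max_steps=20):
    """Loop until the agent is done, bounded two ways: a hard cap on
    iterations, and a switch to the cheap model once the budget is
    nearly spent."""
    spent, model, answer = 0.0, "expensive-model", ""
    for _ in range(max_steps):
        # Downgrade before a call would exceed the budget.
        if spent + COST_PER_CALL[model] > budget:
            model = "cheap-model"
        spent += COST_PER_CALL[model]
        answer, done = call_model(model, prompt)
        if done:
            break
    return answer, spent
```

This only bounds spend going forward; it doesn't make the total cost known ahead of time, which is the parent's point.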

tatrions · today at 8:00 PM

[dead]

tatrions · today at 7:49 PM

[dead]