Hacker News

PashaGo | today at 12:12 PM | 2 replies

It would be nice to see some metrics. I think the missing layer here is evaluation. If agents are going to produce applications, the platform needs not only guardrails but also public-ish evidence that those guardrails actually catch failures.


Replies

owulveryck | today at 1:56 PM

I fully agree.

raicursive | today at 12:42 PM

[dead]