It would be nice to see some metrics. I think the missing layer here is evaluation. If agents are going to produce applications, the platform needs not only guardrails, but public-ish evidence that those guardrails actually catch failures
I fully agree
[dead]
I fully agree