(OP here)
I genuinely think multi-agent setups are a probable future for enabling coding at the scale of a big corporation.
I agree, and I have not seen it work yet, but the trials were most likely at a small scale, where it is simply over-engineering.
(Btw: I do not sell tokens. I think distributing the work across agents in a platform is a way to control costs by optimizing specialised agents.)
It would be nice to see some metrics. I think the missing layer here is evaluation. If agents are going to produce applications, the platform needs not only guardrails, but also public-ish evidence that those guardrails actually catch failures.
I like the topic and I think orgs are struggling with the question:
What do our teams look like now?
But I have some big concerns with your approach here. This post is written like an authoritative summary, but you admit you have not seen it work. Why is there so much untested conjecture presented as best practice here? If you had tested it, you would realize this proposal is not possible in most orgs. Their "platform" will not be extensive enough to prevent mishaps by teams made up of non-engineers.