Other than being borderline impossible to secure, it “should just work” once the AIs get smart enough.
Fine-tuning the model based on example pages and responses might be all that’s required for a sufficient level of consistency.
An immediate use case might be in-place prototyping.
If you have an existing site, you can capture its request-response pairs and train the AI on them, annotated with the spec docs. Then tell it to implement some new functionality, and it should be able to do so. Just route a subset of the site to the AI instead of the normal controllers.
One could “design” new components and functionality in English and try them instantly, with no compilation or deployment steps!
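To make the routing idea concrete, here is a rough sketch of what "capture the pairs, then send a subset of the site to the AI" could look like. Everything in it is hypothetical and invented for illustration: the `Site` class, the `/beta/` prefix, the JSONL layout, and above all `stub_ai_handler`, which just returns a canned page where a real setup would call a fine-tuned model.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, Dict, List, Tuple


@dataclass
class Request:
    method: str
    path: str
    body: str = ""


@dataclass
class Response:
    status: int
    body: str


Handler = Callable[[Request], Response]


def stub_ai_handler(request: Request) -> Response:
    # Placeholder (assumption): a real version would send the request to the model.
    return Response(200, f"<p>AI-generated page for {request.path}</p>")


class Site:
    """Dispatches to normal controllers, except for paths handed to the AI."""

    def __init__(self, controllers: Dict[str, Handler],
                 ai_handler: Handler, ai_prefixes: Tuple[str, ...]):
        self.controllers = controllers
        self.ai_handler = ai_handler
        self.ai_prefixes = ai_prefixes
        self.captured: List[dict] = []  # request-response pairs from real controllers

    def handle(self, request: Request) -> Response:
        # Paths under an AI prefix bypass the normal controllers entirely.
        if request.path.startswith(self.ai_prefixes):
            return self.ai_handler(request)
        controller = self.controllers.get(request.path)
        if controller is None:
            return Response(404, "not found")
        response = controller(request)
        # Record the pair so it can later be used as a training example.
        self.captured.append({"request": asdict(request),
                              "response": asdict(response)})
        return response

    def training_jsonl(self, spec: str) -> str:
        # One JSON object per line, each pair annotated with the spec text.
        return "\n".join(json.dumps({"spec": spec, **pair})
                         for pair in self.captured)


def home(request: Request) -> Response:
    return Response(200, "<h1>Home</h1>")


site = Site({"/": home}, stub_ai_handler, ai_prefixes=("/beta/",))
site.handle(Request("GET", "/"))                # served by the normal controller
site.handle(Request("GET", "/beta/wishlist"))   # served by the AI stub
print(site.training_jsonl("Home page shows a heading."))
```

Only the traffic served by real controllers is captured here, so the AI's own output never ends up in its training data. Whether that is the right call is a design choice, not something the idea itself dictates.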