Exactly. There's a difference between vibe coding and agentic software engineering. One is just prompting and hoping for the best. It works surprisingly well, up to a point, and then it doesn't. If that's happening to you, you might be doing it wrong. The other is forcing agents to do it right: working in a TDD way, cleaning up code that needs cleaning up, following processes with checklists, and so on. You need to be diligent about what you put in there, and a lot of experience goes into knowing what to ask for and how. But it boils down to being a bit strict, intervening when it goes off the rails, and then correcting it via skills so it won't happen again.
I've been working on an Ansible codebase for the past few weeks. I put it together manually a few years ago, then unleashed codex on it to modernize it and adapt it to a new deployment. It's been great. I have a lot of skills in that repository that explain how to do stuff. I'm also letting codex run the provisioning and do diagnostics; you can't do that unless you have good guard rails. It's actually a bit annoying because it will refuse to take shortcuts (which I might be tempted to take) and sticks to the process.
I actually don't write the skills directly; I generate them, usually at the end of a session where I stumbled on something that works. I just tell it to update the repo-local skills with what we just did. Works great and makes stuff repeatable.
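For anyone who hasn't seen one, here's a rough sketch of what a generated repo-local skill can look like. The file path, frontmatter fields, and the whole restart procedure are invented for illustration (mine are specific to my deployment), loosely following the markdown-plus-frontmatter convention agent skills tend to use:

```markdown
<!-- .codex/skills/restart-wireguard.md — hypothetical path and content -->
---
name: restart-wireguard
description: Safely roll out a WireGuard config change across the mesh
---

1. Dry-run first: `ansible-playbook site.yml --tags wireguard --check --diff`.
   Abort if the diff touches anything outside the wireguard role.
2. Apply to a single canary host with `--limit`, then verify peers are up
   with `wg show` before going further.
3. Roll out to the remaining hosts one at a time. Never restart all nodes
   at once.
```

The point is that the agent wrote this from the session transcript, and next time it follows the checklist instead of improvising.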
At this point I'm comfortable generating code in languages I don't really use myself. I currently have two Go projects I'm working on, for example. I'm never going to review most of that code. But I am going to make sure it has tests that prove it implements detailed specifications. I work at the specification level for this. I think a lot of the industry is going to transition in that direction.
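To make "tests that prove the spec" concrete, here's a minimal sketch in Go. The `Slugify` function and its spec clauses are invented for illustration (not from my actual projects); the idea is that each test case maps back to one sentence of the written specification, so the agent can't claim compliance without passing it:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var nonAlnum = regexp.MustCompile(`[^a-z0-9]+`)

// Slugify is a hypothetical function under test. Spec (invented):
// lowercase everything, collapse runs of non-alphanumerics into a
// single hyphen, and never emit leading or trailing hyphens.
func Slugify(s string) string {
	s = strings.ToLower(s)
	s = nonAlnum.ReplaceAllString(s, "-")
	return strings.Trim(s, "-")
}

func main() {
	// Each case encodes one clause of the written spec.
	cases := []struct{ in, want string }{
		{"Hello, World!", "hello-world"},  // lowercase; punctuation collapses to one hyphen
		{"  spaced  out  ", "spaced-out"}, // no leading/trailing hyphens
		{"already-fine", "already-fine"},  // valid slugs pass through unchanged
	}
	for _, c := range cases {
		if got := Slugify(c.in); got != c.want {
			panic(fmt.Sprintf("Slugify(%q) = %q, want %q", c.in, got, c.want))
		}
	}
	fmt.Println("all spec cases pass")
}
```

In a real project these would be table-driven `go test` cases rather than a `main`, but the shape is the same: the spec is the source of truth, and the tests are its executable mirror.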