> after using it for months you get a ‘feel’ for what kind of mistakes it makes
Sure, go ahead and bet your entire operation on your intuition of how a non-deterministic, constantly changing black box of software "behaves". Don't see how that could backfire.
What, you don't trust the vibes? Are you some sort of luddite?
Anyways, try a point-release upgrade of a SOTA model; you're probably holding it wrong.
> bet your entire operation
What straw man is doing that?
So, like all software? Why do you think there are so many security scanners and whatnot out there?
There are millions of lines of code running on a typical box. Unless you're in embedded, you have no real idea what you're running.
Not betting my entire operation. If the only thing stopping a bad 'deploy' command from destroying your entire operation is that you don't trust the agent to run it, then you have worse problems than too much trust in agents.
I similarly use my 'intuition' (i.e. evidence-based previous experience) to decide which people on my team get access to which services.