This looks great.
One thing I’ve been bitten by with desktop agents is execution-time safety: the plan is correct, but a single malformed path or OS call causes real damage.
Do you enforce any guardrails at the tool boundary (e.g. path sandboxing, network allowlists, dry-run / replay)?
Curious how you’re thinking about this.
Phenomenal questions. Sandboxing would be a PHENOMENAL idea. As for allowlists, it's currently capable of this, but it requires code changes, so a configuration-based version is probably closer to what you're referring to?
The replay feature is similar to the record feature. I wouldn't call it a "guardrail", though.
All of these would definitely be great ideas.
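For anyone following along, here is a rough sketch of what configuration-based guardrails at the tool boundary could look like: a path sandbox check, a network allowlist, and a dry-run switch. This is not the project's actual code. Every name here (`CONFIG`, `is_path_allowed`, `is_host_allowed`, `guarded_write`, the sandbox path, and the example host) is hypothetical and only meant to illustrate the idea.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical settings; in practice these might be loaded from a config file
# so that changing the policy does not require code changes.
CONFIG = {
    "sandbox_root": "/tmp/agent-workspace",
    "allowed_hosts": ["api.example.com"],
    "dry_run": True,
}


def is_path_allowed(path: str, sandbox_root: str) -> bool:
    """Return True only if `path` resolves to a location inside the sandbox."""
    root = Path(sandbox_root).resolve()
    # Joining with an absolute path discards the root, so "/etc/passwd"
    # resolves outside the sandbox and is rejected, as is "../" traversal.
    target = (root / path).resolve()
    return target == root or root in target.parents


def is_host_allowed(url: str, allowed_hosts: list) -> bool:
    """Return True only if the URL's hostname is on the allowlist."""
    host = urlparse(url).hostname
    return host is not None and host in allowed_hosts


def guarded_write(path: str, data: str, config: dict) -> str:
    """Write `data` inside the sandbox, or only describe the write in dry-run mode."""
    if not is_path_allowed(path, config["sandbox_root"]):
        raise PermissionError(f"path escapes sandbox: {path}")
    target = (Path(config["sandbox_root"]).resolve() / path).resolve()
    if config["dry_run"]:
        return f"DRY RUN: would write {len(data)} bytes to {target}"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(data)
    return f"wrote {len(data)} bytes to {target}"
```

With checks like these, a tool call such as `guarded_write("../../etc/hosts", "...", CONFIG)` raises `PermissionError` before anything touches the disk. Note that resolving the path at check time does not rule out a symlink being swapped in between the check and the write, so this is a first line of defense rather than a full sandbox.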