It's not _really_ any different to running an undocumented third party binary. Is it the height of irresponsibility to run Windows, or VSCode, or Spotify?
I think the model we've got now is wrong, and the harnesses should be OS-level sandboxed, and the agents should be running in harness managed sandboxes.