The most interesting part of it to me (nothing particularly special, but I hadn't seen it before) is giving it full file system access so it'll write its own tools to come back to later.
It's an obvious move in hindsight, but I hadn't thought of it. Now, the number of people running it outside of a sandbox or isolated machine while giving it that kind of access would probably make me cry.
Isn’t that just literally Claude Code’s own “make skill” skill?
So much opportunity to build botnets that I can't even.
The idea of an agent making its own harness is really powerful. I gave it a try here, with some opinionated choices:
https://github.com/caesarnine/binsmith
Been running it on a locked-down Hetzner server, using Tailscale to interact with it, and it's been surprisingly useful even just defaulting to Gemini 3 Flash.
It feels like the general shape of things to come: if agents can code, then why can't they make their own harness for the very specific environments they end up in (whether it's a business, a super personalized agent for a user, etc.)? How to make it not a security nightmare is probably the biggest open question, and I assume it's why Anthropic and others haven't gone full bore into it.
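For anyone wondering what "writes its own tools to come back to later" could look like at its simplest, here's a rough sketch. To be clear, this is not how binsmith or any other project does it; `ToolStore`, `save`, `list_tools`, and `demo` are names I made up for illustration. The only idea it shows is keeping agent-written scripts inside one workspace directory and rejecting names that try to escape it.

```python
from pathlib import Path
import stat
import tempfile


class ToolStore:
    """Hypothetical store for scripts an agent writes for later reuse."""

    def __init__(self, root: Path) -> None:
        # Create the workspace first, then resolve symlinks so later
        # containment checks compare fully resolved paths.
        root.mkdir(parents=True, exist_ok=True)
        self.root = root.resolve()

    def _safe_path(self, name: str) -> Path:
        # Refuse anything that resolves outside the workspace,
        # e.g. "../evil.sh" or an absolute path like "/etc/passwd".
        candidate = (self.root / name).resolve()
        if candidate == self.root or not candidate.is_relative_to(self.root):
            raise ValueError(f"tool name escapes workspace: {name!r}")
        return candidate

    def save(self, name: str, source: str) -> Path:
        # Write the script and mark it executable for the owner.
        path = self._safe_path(name)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(source, encoding="utf-8")
        path.chmod(path.stat().st_mode | stat.S_IXUSR)
        return path

    def list_tools(self) -> list[str]:
        # What the agent would read at the start of a new session.
        return sorted(
            p.relative_to(self.root).as_posix()
            for p in self.root.rglob("*")
            if p.is_file()
        )


def demo() -> list[str]:
    # Throwaway directory standing in for the agent's workspace.
    with tempfile.TemporaryDirectory() as tmp:
        store = ToolStore(Path(tmp) / "tools")
        store.save("greet.sh", "#!/bin/sh\necho hello\n")
        return store.list_tools()
```

Note that the path check only controls where a tool gets saved. It does nothing about what the tool does once it runs, which is exactly why the sandbox or isolated machine still matters.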