Hacker News

jgalt212 yesterday at 11:19 PM

Has anyone run a study on how long you can run an agent as root before irreparable damage is done to the VM? A sort of gambler's ruin for the YOLO LLM Age.


Replies

Wowfunhappy yesterday at 11:33 PM

https://forums.macrumors.com/threads/screw-it-lets-make-clau...

For me, it took a bit over six weeks of Claude running unattended perpetually.

nijave yesterday at 11:49 PM

I gave Sonnet 4.6 root access to my Android via adb and it wrote Frida scripts to help me recover the encryption keys from SwiftBackup.

Also gave Opus 4.6 access to a Kubernetes container, and it was able to use pyrasite (a Python replacement that attached to a running process with gdb) to debug a "memory leak" in Python.

I don't think I'd let them run unattended on anything I care about, especially if there weren't backups, but they've never tried to break anything while supervised.

Usually it's significantly faster and more accurate to give the LLM/harness access to the thing to debug than to try to copy/paste back and forth.
