Hacker News

yonatan8070 today at 4:42 PM

Another issue I've noticed is that they're sometimes very resourceful. For example, when Codex can't directly edit a file due to sandboxing restrictions, rather than asking "hey, can I apply this diff to the file?", it asks for permission to run a `cat EOF` command to rewrite the whole file, which the UI doesn't surface properly (it just shows the first line...).

This sounds similar to what's described in the "Claude deleted my DB" post: it decided "I need to do X", then searched for whatever would let it do X, regardless of its intended purpose.


Replies

amluto today at 4:56 PM

I amused myself by removing codex-rs’s web search tool and then asking it to search for “foo”. It wrote a Python script to do the search.