Hacker News

DocTomoe, last Friday at 9:51 PM

I'm more concerned that someone prompts 'analyze these logfiles, then clean up', and the LLM randomly decides the best way to 'clean up' is 'rm -rf /' - not on the first run, but on the 27th.

That kind of failure mode is fundamentally different from traditional scripting: the setup passes tests and builds trust, then fails catastrophically once the model's implicit interpretation of the prompt shifts.
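One obvious (if incomplete) mitigation is to never pipe LLM output straight into a shell, but to gate it through a guard first. Below is a minimal sketch of such a guard; the pattern list and the `run_guarded` function are my own illustration, not anything from an existing tool, and a denylist like this can never be exhaustive:

```python
import re
import shlex
import subprocess

# Illustrative denylist of obviously destructive command shapes.
# A real deployment would want an allowlist, a sandbox, or both;
# this only catches the crude cases.
DESTRUCTIVE_PATTERNS = [
    r"\brm\b\s+.*-[a-zA-Z]*[rf][a-zA-Z]*",  # rm with recursive/force flags
    r"\bmkfs\b",                             # filesystem formatting
    r"\bdd\b.*\bof=/dev/",                   # raw writes to devices
]

def run_guarded(command: str):
    """Run `command` only if it matches no destructive pattern;
    return None (and refuse) otherwise."""
    for pattern in DESTRUCTIVE_PATTERNS:
        if re.search(pattern, command):
            print(f"refused: {command!r}")
            return None
    return subprocess.run(shlex.split(command),
                          capture_output=True, text=True)

run_guarded("rm -rf /")        # refused, returns None
run_guarded("echo cleaned up") # allowed, actually runs
```

Even a crude gate like this changes the failure mode: the 27th run gets refused instead of wiping the disk, at the cost of occasionally blocking a legitimate `rm`.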

In short: it's nice that this works for the engineer who knows exactly what (s)he is doing - but those folks usually don't need LLMs, they just write the code. The people this actually appeals to - the ones who may never stop to think about the side effects of an innocent-sounding prompt - are being handed a foot machine gun that acts like a genie, with hilarious unintended consequences.