icedchai • last Friday at 2:03 PM • 1 reply

I've seen LLMs implement "creative" workarounds. Example: Sonnet 4.5 couldn't figure out how to authenticate a WebSocket request using whatever framework I was experimenting with, so it decided to just not bother. Instead, it passed the username as part of the WebSocket request and blindly trusted that the user was actually authenticated.

The application looked like it worked. Tests did pass. But if you did a cursory examination of the code, it was all smoke and mirrors.
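
To make the failure mode concrete, here's a minimal sketch in Python. It is not the code from that project: the framework is unnamed above, so the handshake is modeled as a plain dict, and `SECRET_KEY`, `sign_username`, `insecure_identify`, and `secure_identify` are all hypothetical names. The HMAC-signed token is just one assumed way to tie a socket to a real login; a session cookie or JWT check would serve the same purpose.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical server-side secret; a real app would load this from config.
SECRET_KEY = b"example-secret"


def sign_username(username: str) -> str:
    """Issue a token after a real login, shaped as 'username.hexsignature'."""
    sig = hmac.new(SECRET_KEY, username.encode(), hashlib.sha256).hexdigest()
    return f"{username}.{sig}"


def insecure_identify(handshake: dict) -> str:
    # The anti-pattern: trust whatever username the client put in the request.
    return handshake["username"]


def secure_identify(handshake: dict) -> Optional[str]:
    # Only accept an identity backed by a signature the server itself issued.
    token = handshake.get("token", "")
    username, _, sig = token.rpartition(".")
    if not username:
        return None
    expected = hmac.new(SECRET_KEY, username.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison of the presented and expected signatures.
    if hmac.compare_digest(sig.encode(), expected.encode()):
        return username
    return None
```

With the first function, any client can claim to be `admin` just by saying so, and a happy-path test that connects as a valid user will still pass. The second rejects a forged or missing token, which is the kind of negative test that would have caught the workaround.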

Replies

svachalek • last Friday at 6:18 PM

Yeah, recently it had an issue getting OIDC working and decided to implement its own, throwing in a few thousand extra lines. I'm sure there were no security holes created in there at all. /s
