Hacker News

dvt, yesterday at 7:19 PM

Apart from the article being generally just dumb (like, of course you can circumvent guardrails by changing the raw token stream; that's... how models work), it also might be disrespecting the reader. Looks like it's, at least in part, written by AI:

> The punchline here is that “safety” isn’t a fundamental property of the weights; it’s a fragile state that evaporates the moment you deviate from the expected prompt formatting.

> When the models “break,” they don’t just hallucinate; they provide high-utility responses to harmful queries.

Straight-up slop, surprised it has so many upvotes.


Replies

mr_toad, yesterday at 9:52 PM

What’s the AI smell now? Are we not allowed to use semi-colons any more? Proper use of apostrophes? Are we all going to have to write like pre-schoolers to avoid being accused of being AI?
