genidoi • today at 2:31 AM • 1 reply

Especially given that the LLM does not trust the user. An LLM can be jailbroken into lowering its guardrails, but no amount of rapport-building lets you talk directly about the material details of banned topics. Might as well never trust it.

Replies

gverrilla • today at 3:45 AM

I wouldn't trust you either - what topics are you even talking about?
