ls612 yesterday at 10:05 PM

https://github.com/p-e-w/heretic


Replies

solenoid0937 today at 5:33 AM

Anyone recommending abliteration ironically proves the argument against open weights from an AI safety perspective.

After a certain level of capability, you're proposing handing loaded nukes to everyone. The "open models are good" argument has an end of the road, and that end is when the models start turning into cyber superweapons.

atemerev yesterday at 10:08 PM

Heretic is a general abliteration framework, mostly used to remove safety alignment, not CCP alignment. Yes, you can feed it China-specific prompts, but you'll need a dataset first (one is available at deccp).

Also, Heretic as it stands does not work for GLM5.2 (at least as of 3 days ago, when I tested it). You'll need some hybrid approaches.