
btbuildem · today at 6:38 AM

I wonder if this works better on smaller models than larger ones -- can anyone weigh in? I played a bit with the gpt-oss-20b-heretic model from HF, and it's frankly still quite refusey.

I've made some changes to the repo (locally) to leverage multiple GPUs and CPU offloading, and had mixed luck with Qwen3 14B. The process either completely lobotomizes the model into a drooling mess, or has no effect at all.
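
For context, a rough sketch of the multi-GPU / CPU-offload idea, using transformers' device_map and max_memory options (the model id and memory limits are placeholders for illustration, not the actual patch):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "Qwen/Qwen3-14B"  # placeholder model id

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.bfloat16,  # halve memory vs. fp32
        device_map="auto",           # shard layers across available GPUs
        # spill whatever doesn't fit onto CPU RAM (limits are illustrative)
        max_memory={0: "22GiB", 1: "22GiB", "cpu": "64GiB"},
    )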

Some further tweaks enabled abliterating the new Granite models -- there the success rate was higher (1/50 refusals at 0.02 KL divergence).

If I understand the approach correctly, one could crank the trial count way up and let the optimizer explore more of the parameter space, hoping to find better trade-offs that way (minimizing both refusals and KL divergence).
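
A hypothetical sketch of what that would look like, e.g. with Optuna's multi-objective mode (evaluate_abliteration is a stand-in for whatever scores a candidate ablation; the actual optimizer and parameter space in the repo may differ):

    import optuna

    def objective(trial):
        # search over a couple of illustrative ablation parameters
        weight = trial.suggest_float("ablation_weight", 0.0, 2.0)
        max_layer = trial.suggest_int("max_layer", 1, 40)
        refusals, kl_div = evaluate_abliteration(weight, max_layer)  # hypothetical helper
        return refusals, kl_div

    # minimize both objectives; more trials = more of the space explored
    study = optuna.create_study(directions=["minimize", "minimize"])
    study.optimize(objective, n_trials=1000)
    print(study.best_trials)  # Pareto front of (refusals, KL divergence) trade-offs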