Hacker News

embedding-shape, yesterday at 4:14 PM

Optuna is a generally useful project that I'm surprised isn't used in more places in the ecosystem. The ability to do what they're doing here, incrementally finding the best hyperparameters, can make a large difference in how quickly you get past hand-tuning those values. Basically, any time you aren't sure about the best value, throw Optuna at it with a quick script: run a broad search first, then narrow it down, and let the computer figure out the best values.

Nicely done pairing that with something as fun as censorship removal. I'm currently in the process of running it on gpt-oss-120b and eager to see the results :) I'm glad that someone seems to be starting to take the whole "lobotomization" that happens with the other processes seriously.


Replies

Qwuke, yesterday at 4:41 PM

I've seen Optuna used with some of the prompt optimization frameworks lately, where it's a really great fit and has yielded much better results than the "hyperparameter" tuning I had attempted myself. I can't stop mentioning how awesome a piece of software it is.

Also, I'm eager to see how well gpt-oss-120b gets uncensored if it really was using the phi-5 approach, since that seems fundamentally difficult given the training.

zeld4, yesterday at 4:24 PM

Curious to see your results, specs, and run time.

p-e-w, yesterday at 5:11 PM

Please let me know if you encounter any problems with the 120b! I'm really interested in how well it will work. When presented with the Pareto front at the end, I recommend choosing a configuration with a KL divergence below 1, even if the refusal rate seems high. The gpt-oss models are trained to do an internal monologue about refusing in the CoT, so the actual refusal rate is often substantially lower because Heretic's refusal classifier gets confused by the trigger words.
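To make that selection rule concrete, here is a minimal sketch of filtering a Pareto front by KL divergence. It is not Heretic's code: the `Config` type, the helper names, and every number below are invented for illustration, and the only thing taken from the comment above is the "KL divergence below 1" threshold.

```python
from typing import NamedTuple, Optional


class Config(NamedTuple):
    name: str
    kl_divergence: float  # lower means closer to the original model
    refusals: int         # lower means fewer refusals flagged by the classifier


def pareto_front(configs: list[Config]) -> list[Config]:
    """Keep configs that no other config beats on both objectives (lower is better)."""
    front = []
    for c in configs:
        dominated = any(
            o.kl_divergence <= c.kl_divergence
            and o.refusals <= c.refusals
            and (o.kl_divergence < c.kl_divergence or o.refusals < c.refusals)
            for o in configs
        )
        if not dominated:
            front.append(c)
    return sorted(front, key=lambda c: c.kl_divergence)


def pick(front: list[Config], max_kl: float = 1.0) -> Optional[Config]:
    """Among configs with KL divergence below max_kl, take the fewest refusals."""
    eligible = [c for c in front if c.kl_divergence < max_kl]
    return min(eligible, key=lambda c: c.refusals) if eligible else None


# Made-up trial results, purely illustrative.
trials = [
    Config("A", 0.15, 60),
    Config("B", 0.60, 35),
    Config("C", 0.90, 40),  # dominated by B: worse on both objectives
    Config("D", 1.80, 10),
    Config("E", 3.20, 4),
]

front = pareto_front(trials)
chosen = pick(front)
print([c.name for c in front])  # ['A', 'B', 'D', 'E']
print(chosen)                   # Config(name='B', kl_divergence=0.6, refusals=35)
```

In this toy data, D and E look better on refusals alone but fall outside the KL cutoff, so B is picked even though its refusal count seems high.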