
andy99 · yesterday at 11:27 PM

I’d like to know how they chat-tuned it. Getting the base model is one thing; did they also create a bunch of conversations for SFT, and if so, how was it done?

  We develop chatbots while minimizing interference with the normative judgments acquired during pretraining (“uncontaminated bootstrapping”).
So they are chat-tuning it. I wonder what “minimizing interference with normative judgments” really amounts to and how objective it is.

Replies

jeffjeffbear · today at 12:07 AM

They have some more details at https://github.com/DGoettlich/history-llms/blob/main/ranke-4...

Basically, using GPT-5 and being careful.
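
For flavor, a minimal sketch of what GPT-5-assisted conversation generation could look like, assuming the OpenAI Python SDK. The system prompt and data shape are illustrative guesses, not the linked repo's actual pipeline:

    from openai import OpenAI

    # Sketch only: turn a pretraining passage into an SFT dialogue while
    # trying not to overwrite the passage's own judgments ("being careful").
    client = OpenAI()

    SYSTEM = (
        "Rewrite the following passage as a short question-and-answer "
        "exchange. Preserve the passage's own judgments and vocabulary; "
        "do not add modern framing or commentary."
    )

    def make_sft_example(passage: str) -> dict:
        resp = client.chat.completions.create(
            model="gpt-5",  # model name taken from the comment above
            messages=[
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": passage},
            ],
        )
        return {"source": passage, "dialogue": resp.choices[0].message.content}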

zozbot234 · today at 12:11 AM

You could extract quoted speech from the data (especially in Q&A format) and treat that as "chat" that the model should learn from.
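
A rough sketch of that idea, assuming plain-text sources where consecutive quotations can be paired as question/answer turns (the regex and length bounds are arbitrary heuristics):

    import re

    # Match straight or curly double-quoted spans of plausible utterance length.
    QUOTE_RE = re.compile(r'[“"]([^”"]{10,400})[”"]')

    def extract_dialogue(text: str) -> list[dict]:
        quotes = QUOTE_RE.findall(text)
        turns = []
        # Naive Q&A heuristic: treat each consecutive pair of quotes
        # as one user/assistant exchange the model can learn from.
        for i in range(0, len(quotes) - 1, 2):
            turns.append({
                "messages": [
                    {"role": "user", "content": quotes[i].strip()},
                    {"role": "assistant", "content": quotes[i + 1].strip()},
                ]
            })
        return turns

In practice you'd want speaker attribution rather than blind pairing, but the appeal is that the "chat" data comes from the corpus itself instead of a modern model.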