
andy99 · yesterday at 11:27 PM

I’d like to know how they chat-tuned it. Getting the base model is one thing; did they also create a bunch of conversations for SFT, and if so, how was it done?

  We develop chatbots while minimizing interference with the normative judgments acquired during pretraining (“uncontaminated bootstrapping”).
So they are chat-tuning it. I wonder what “minimizing interference with normative judgments” really amounts to and how objective it is.

Replies

jeffjeffbear · today at 12:07 AM

They have some more details at https://github.com/DGoettlich/history-llms/blob/main/ranke-4...

Basically, using GPT-5 and being careful.
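
For flavor, a minimal sketch of what GPT-5-assisted conversation generation could look like, assuming the OpenAI Python SDK. The system prompt and data shape are illustrative guesses, not the linked repo's actual pipeline:

    from openai import OpenAI

    # Sketch only: turn a pretraining passage into an SFT dialogue while
    # trying not to overwrite the passage's own judgments ("being careful").
    client = OpenAI()

    SYSTEM = (
        "Rewrite the following passage as a short question-and-answer "
        "exchange. Preserve the passage's own judgments and vocabulary; "
        "do not add modern framing or commentary."
    )

    def make_sft_example(passage: str) -> dict:
        resp = client.chat.completions.create(
            model="gpt-5",  # model name taken from the comment above
            messages=[
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": passage},
            ],
        )
        return {"source": passage, "dialogue": resp.choices[0].message.content}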

zozbot234 · today at 12:11 AM

You could extract quoted speech from the data (especially in Q&A format) and treat that as "chat" that the model should learn from.
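
A rough sketch of that idea, assuming plain-text sources where consecutive quotations can be paired as question/answer turns (the regex and length bounds are arbitrary heuristics):

    import re

    # Match straight or curly double-quoted spans of plausible utterance length.
    QUOTE_RE = re.compile(r'[“"]([^”"]{10,400})[”"]')

    def extract_dialogue(text: str) -> list[dict]:
        quotes = QUOTE_RE.findall(text)
        turns = []
        # Naive Q&A heuristic: treat each consecutive pair of quotes
        # as one user/assistant exchange the model can learn from.
        for i in range(0, len(quotes) - 1, 2):
            turns.append({
                "messages": [
                    {"role": "user", "content": quotes[i].strip()},
                    {"role": "assistant", "content": quotes[i + 1].strip()},
                ]
            })
        return turns

In practice you'd want speaker attribution rather than blind pairing, but the appeal is that the "chat" data comes from the corpus itself instead of a modern model.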