Hacker News

Diti, today at 12:50 PM

Yes. The first step of aligning each and every GPT-based LLM is to suppress the “I am human” kind of responses. It’s baked into the weights.


Replies

Gigachad, today at 12:55 PM

Reminds me of old cleverbot conversations where it would always assert it is human and you are the bot.

Trained on previous conversations with people.

Tenoke, today at 12:58 PM

It's also at minimum baked into the system prompt of virtually any LLM.
