How long until we get to the point where models know that LLMs get this wrong, and that it is an LLM...

MrCheeze • 01/21/2025 • 1 reply

How long until we get to the point where a model knows that LLMs get this wrong, knows that it is an LLM, and therefore answers wrong on purpose? Has this already happened?

(I doubt it has, but there ARE already cases where models know they are LLMs, and therefore make the plausible but wrong assumption that they are ChatGPT.)

Replies

sebastiennight • 01/27/2025

My understanding is that the model does not "know" it is an LLM. It is prompted (in the app's system prompt) or trained during RLHF to answer that it is an LLM.

