Sometimes I wonder if LLM proponents even understand their own bullshit.
It's all just tokens in the context window right? Aren't system prompts just tokens that stay appended to the front of a conversation?
They're going to keep dressing this up six different ways to Sunday but it's always just going to be stochastic token prediction.
Yep, every AI call is essentially just asking it to predict what the next word is after:
<system>
You are a helpful assistant.
</system>
<user>
Why is the sky blue?
</user>
<assistant>
Because of Rayleigh scattering. The blue light scatters more.
</assistant>
<user>
Why is it red at sunset then?
</user>
<assistant>
And we keep repeating that until the next word is `</assistant>`, then extract the bit between the last assistant tags and return it. The AI has been trained to look at `<user>` differently to `<system>`, but they're not physically different. It's all prompt, it can all be engineered. Hell, you can even get a long way by pre-filling the start of the assistant response. Usually works better than a system message. That's prompt engineering too.
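To make that loop concrete, here's a rough sketch in Python. None of this is a real API: `model.next_token` is a made-up stand-in for whatever actually scores the next token, and real inference predicts token IDs rather than strings, but the shape of the loop is the same.

```python
def complete(transcript: str, model) -> str:
    # Assumes the transcript ends with an open <assistant> tag,
    # as in the example above.
    while not transcript.endswith("</assistant>"):
        transcript += model.next_token(transcript)
    # Return just the bit between the last assistant tags.
    start = transcript.rindex("<assistant>") + len("<assistant>")
    end = transcript.rindex("</assistant>")
    return transcript[start:end].strip()

# Pre-filling is just more prompt: open the assistant tag yourself
# and start the answer, and the model can only continue from there.
prefilled = (
    "<system>\nYou are a helpful assistant.\n</system>\n"
    "<user>\nWhy is it red at sunset then?\n</user>\n"
    "<assistant>\nIn one sentence:"
)
# complete(prefilled, model) returns a reply that begins
# "In one sentence:" -- the prefill steered it, no system message needed.
```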
> Sometimes I wonder if LLM proponents even understand their own bullshit.
Categorically, no. Most are not software engineers; in fact, most are not engineers of any sort. A whole lot of them are marketers, the same kinds of people who pumped crypto way back.
LLMs have uses. Machine learning has a ton of uses. AI art is shit, LLM writing is boring, code generation and debugging is pretty cool, information digestion is a godsend some days when I simply cannot make my brain engage with whatever I must understand.
As with most things, it's about choosing the right tool for the right task, and the AI hype folk are carpenters with a brand new, shiny hammer, and they're gonna turn every fuckin problem they can find into a nail.
Also for the love of god do not have ChatGPT draft text messages to your spouse, genuinely what the hell is wrong with you?
System prompts don't even have to be appended to the front of the conversation. For many models they're actually delimited using special custom tokens, so the token stream looks a bit like:
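(What follows is one concrete possibility, a ChatML-style layout; the exact special tokens vary from model to model.)

<|im_start|>system
You are a helpful assistant.
<|im_end|>
<|im_start|>user
Why is the sky blue?
<|im_end|>
<|im_start|>assistant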
The models are then trained to (hopefully) treat the tokens inside those system delimiters as more influential over how the rest of the input gets handled.
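If you want to see the exact delimiters a given model uses, you can render a conversation through its chat template. Here's a sketch using Hugging Face transformers; the model name is just one example, any chat-tuned model with a template will do:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Why is the sky blue?"},
]

# Render the conversation the way the model actually sees it,
# special delimiter tokens and all.
print(tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```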