An interesting little example of this problem is initial prompting, which is effectively just a permanent, hidden context that can't be cleared. On Twitter right now, the "Grok" bot has recently begun frequently mentioning "White Genocide," which is, y'know, odd. This is almost certainly because someone recently adjusted its prompt to tell it what its views on white genocide are meant to be, which for a perfect chatbot wouldn't matter when you ask it about other topics, but it DOES matter. It's part of the context. It's gonna talk about that now.
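The mechanism is easy to see in the common chat-completion message format: the system prompt is prepended to the context on every single turn, so any instruction in it rides along regardless of topic. A minimal sketch (the prompt text and helper are hypothetical, for illustration only):

```python
def build_context(system_prompt, history, user_message):
    """Assemble the full context sent to the model for one turn.

    The system message goes first on EVERY request; the user has
    no way to clear it from within the conversation.
    """
    return (
        [{"role": "system", "content": system_prompt}]
        + history
        + [{"role": "user", "content": user_message}]
    )

# Hypothetical injected instruction, hidden from the user:
system_prompt = "You are a helpful bot. Your view on topic X is Y."

# Even a completely unrelated question still carries the instruction:
ctx = build_context(system_prompt, [], "What's a good pasta recipe?")
assert ctx[0]["content"] == system_prompt  # always present in context
```

So even if the model were "perfect" at staying on topic, the biased instruction is physically in-context for every reply, and attention to it can leak into unrelated answers.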
Well, telling an AI chatbot to insist on discussing a white genocide seems like a perfectly Elon thing to do!
> This is almost certainly because someone recently adjusted its prompt to tell it what its views on white genocide are
Do you have any source on this? System prompts get leaked/extracted all the time, so I'd imagine someone would have noticed this.
Edit: just realized you’re talking about the Grok bot, not Grok the LLM available on X or grok.com. With the bot it’s probably harder to extract its exact instructions since it only replies via tweets. For reference here’s the current Grok the LLM system prompt: https://github.com/asgeirtj/system_prompts_leaks/blob/main/g...
Probably because it is now learning from a lot of videos posted on X by misc right-wingers showing rallying cries of South African politicians like Julius Malema, Paul Mashatile etc. Not very odd.
As merely 3 of over a dozen examples:
https://x.com/DefiantLs/status/1922213073957327219
Ah, Elon paying attention to his companies again!
Context poisoning is not a uniquely LLM problem
> This is almost certainly because someone recently adjusted its prompt to tell it what its views on white genocide are meant to be
Well, someone did something to it; whether it was training, feature boosting the way Golden Gate Claude [0] was done, adjusting the system prompt, or ensuring that its internet search for contextual information would always return material about that, or some combination of those, is neither obvious nor, if someone had a conjecture as to which one or combination it was, easily falsifiable/verifiable.
[0] https://www.anthropic.com/news/golden-gate-claude