logoalt Hacker News

ACCount37yesterday at 7:39 PM1 replyview on HN

Extracted system prompts are usually very, very accurate.

It's a slightly noisy process, and there may be minor changes to wording and formatting. Worst case, sections may be omitted intermittently. But system prompts that are extracted by AI-whispering shamans are usually very consistent - and a very good match for what those companies reveal officially.

In a few cases, the extracted prompts were compared to what the companies revealed themselves later, and it was basically a 1:1 match.

If this "soul document" is a part of the system prompt, then I would expect the same level of accuracy.

If it's learned, embedded in model weights? Much less accurate. It can probably be recovered fully, with a decent level of reliability, but only with some statistical methods and at least a few hundred $ worth of AI compute.


Replies

simonwyesterday at 8:09 PM

It's not part of the system prompt.