In theory you still use the same blob (i.e. the prompt) to tell the model what to do, but practicall...

kouteiheika • today at 7:12 AM • 1 reply • view on HN

In theory you still use the same blob (i.e. the prompt) to tell the model what to do, but practically it pretty much stops becoming an in-band signal, so no.

As I said, the best way to do this is to inject a brand new special token into the model's tokenizer (one unique token per task), and then prepend that single token to whatever input data you want the model to process (and make sure the token itself can't be injected, which is trivial to do). This conditions the model to look only at your special token to figure out what it should do (i.e. it stops being a general instruction following model), and only look at the rest of the prompt to figure out the inputs to the query.

This is, of course, very situational, because often people do want their model to still be general-purpose and be able to follow any arbitrary instructions.

Replies

zahlman • today at 7:51 AM

> and make sure the token itself can't be injected, which is trivial to do

Are they actually doing this? The stuff that Anthropic has been saying about the deliberate use of XML-style markup makes me wonder a bit.

➕ show 1 reply

alt Hacker News

Replies