logoalt Hacker News

cyanydeezyesterday at 8:22 PM3 repliesview on HN

Basically, the only way you're separting user input from model meta-input is using some kind of character that'll never show up in the output of either users or LLMs.

While technically possible, it'd be like a unicode conspiracy that had to quietly update everywhere without anyone being the wiser.


Replies

Lercyesterday at 11:18 PM

Not at all. You have a set of embeddings for the literal token, and a set for the metadata. At inference time all input gets the literal embedding, the metadata embedding can receive provenance data or nothing at all. You have a vector for user query in the metadata space. The inference engine dissallows any metadata that is not user input to be close to the user query vector.

Imagine a model finteuned to only obey instructions in a Scots accent, but all non user input was converted into text first then read out in a Benoit Blanc speech model. I'm thinking something like that only less amusing.

dragonwritertoday at 5:24 AM

Actually, all you need is an interface that lets you manipulate the token sequence instead of the text sequence along with a map of the special tokens for the model (most [all?] models have special tokens with defined meanings used in training and inference that are not mapped from character sequences, and native harnesses [the backend APIs of hosted models that only provide a text interface and not a token-level one] leverage them to structure input to the model after tokenization of the various pieces that come to the harnesses API from whatever frontend is in use.)

zahlmanyesterday at 11:24 PM

Couldn't you just insert tokens that don't correspond to any possible input, after the tokenization is performed? Unicode is bounded, but token IDs not so much.

show 1 reply