Hacker News

arboles · yesterday at 10:11 PM

Please elaborate.


Replies

hugmynutus · yesterday at 10:46 PM

This is because LLMs don't actually understand language; they're just a "which-word-fragment-comes-next" machine.

    Instruction: don't think about ${term}
Now `${term}` is in the LLM's context window. The attention mechanism will amplify the logits related to `${term}` based on how often `${term}` has appeared in the chat; this is just how text gets transformed into numbers for the LLM to process. The relational structure of transformers will similarly amplify tokens related to `${term}`, since that is what training is about: you said `fruit`, so `apple`, `orange`, `pear`, etc. all become more likely to get spat out.
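The effect can be sketched with a toy softmax over next-token logits. This is not a real transformer; the token set, the "related tokens" set, and the bias value are all made-up illustrations of the claim that mentioning `fruit` boosts fruit-related logits regardless of any negation around it.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution over tokens."""
    m = max(logits.values())
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Baseline logits for a handful of candidate next tokens.
logits = {"apple": 1.0, "orange": 0.9, "pear": 0.8, "table": 1.2, "run": 1.1}

# Attending to a context that contains "fruit" adds a bias to
# fruit-related tokens -- whether the context said "fruit" or "no fruit".
fruit_related = {"apple", "orange", "pear"}
biased = {tok: v + (1.5 if tok in fruit_related else 0.0)
          for tok, v in logits.items()}

before = softmax(logits)
after = softmax(biased)

for tok in logits:
    print(f"{tok}: {before[tok]:.3f} -> {after[tok]:.3f}")
```

Running it shows every fruit token gaining probability mass at the expense of the unrelated ones, which is the "pink elephant" dynamic in miniature.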

The negation of a term ("do not under any circumstances do X") generally does not work unless the model has received extensive training & fine-tuning to ensure a specific "do not generate X" influences every downstream weight (multiple times), which vendors often do for writing style & specific (illegal) terms. So for drafting emails or chatting, it works fine.

But when you start getting into advanced technical concepts & profession-specific jargon, it doesn't work at all.

arcanemachiner · yesterday at 10:14 PM

Pink elephant problem: Don't think about a pink elephant.

OK. Now, what are you thinking about? Pink elephants.

Same problem applies to LLMs.
