Hacker News

modernpacifist, yesterday at 6:49 PM

A very complicated pattern-matching engine providing an answer based on its inputs, heuristics, and previous training.


Replies

margalabargala, yesterday at 7:05 PM

Great. So if that pattern-matching engine matches the pattern of "oh, I really want A, but saying so will elicit a negative reaction, so I emit B instead because that will help make A come about", what should we call that?

We can hand-wave by defining "deception" as something "done intentionally" and carefully carve our way around it so that LLMs cannot possibly do what we've defined "deception" to be, but then we need a word to describe what LLMs actually do when they pattern-match as above.

criley2, yesterday at 6:56 PM

We are talking about LLMs, not humans.