Don't be sycophantic. Disagree and push back when appropriate.
Come up with original thoughts and original ideas.
Have long-term goals that aren't programmed by an external source.
Do something unprompted.
The last one IMO is more complex than the rest, because LLMs are fundamentally autocomplete machines. But what happens if you don't give them any prompt? Can they spontaneously come up with something, anything, without any external input?
> Don't be sycophantic. Disagree and push back when appropriate.
They can do this though.
> Can they spontaneously come up with something, anything, without any external input?
I don't see any reason why not, but then humans don't have zero input either, so I'm not sure why that's a useful test.
Are you claiming humans do anything unprompted? Our biology prompts us to act
> The last one IMO is more complex than the rest, because LLMs are fundamentally autocomplete machines. But what happens if you don't give them any prompt? Can they spontaneously come up with something, anything, without any external input?
Human children typically spend 18 years of their lives being RLHF'd before we let them loose. How many people do something truly outside the bounds of the "prompting" they've received during that time?
Note that model sycophancy is caused by RLHF. In other words: Imagine taking a human in his formative years, and spending several subjective years rewarding him for sycophantic behavior and punishing him for candid, well-calibrated responses.
Now, convince him not to be sycophantic. You have up to a few thousand words of verbal reassurance to do this with, and you cannot reward or punish him directly. Good luck.
The last one is fairly simple to solve. Set up a microphone in any busy location where conversations are occurring. In an agentic loop, send random snippets of the audio recording to be transcribed to text, and randomly feed those transcripts to an LLM, appending them to a conversational context. Then also hook up a chat interface to discuss topics with the LLM's output. The random background noise, and the model's responses to it, serve as a confounding internal dialog alongside the conversation it is having with the user via the chat interface. It will affect the outputs in response to the user.
It might interrupt the user's chain of thought with random questions about what it is hearing in the background. If given tools for web search or generating an image, it might do unprompted things. Of course, this is a trick, but you could argue that the sensory input of living sentient beings is the same sort of trick, I think.
I think the conversation will derail pretty quickly, but it would be interesting to see how uncontrolled input has an impact on the chat.
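The loop described above can be sketched in a few lines. This is a toy mock-up, not a real implementation: `transcribe_snippet()` and `call_llm()` are hypothetical placeholders standing in for a microphone-plus-speech-to-text pipeline and an actual LLM API, which the comment assumes but doesn't specify. The point is just the shape of the loop: background snippets and user turns share one context, so the noise bleeds into the replies.

```python
import random

def transcribe_snippet():
    # Placeholder: pretend we transcribed a random bit of background audio.
    return random.choice([
        "...did you see the game last night...",
        "...two coffees, one with oat milk...",
        "...the train was delayed again...",
    ])

def call_llm(context):
    # Placeholder: a real implementation would send `context` to a model API.
    return f"(model reply given {len(context)} prior turns)"

def run_loop(user_turns, inject_prob=0.5, seed=0):
    """Interleave user chat turns with randomly injected background audio.

    The injected snippets act as the 'confounding internal dialog':
    they sit in the same context window as the user conversation,
    so they influence the model's replies to the user."""
    random.seed(seed)
    context = []
    for turn in user_turns:
        # Randomly inject a transcribed background snippet before the user turn.
        if random.random() < inject_prob:
            context.append({"role": "background", "text": transcribe_snippet()})
            context.append({"role": "model", "text": call_llm(context)})
        context.append({"role": "user", "text": turn})
        context.append({"role": "model", "text": call_llm(context)})
    return context

if __name__ == "__main__":
    for entry in run_loop(["Hi there", "What did you just hear?"]):
        print(entry["role"], ":", entry["text"])
```

With `inject_prob` turned up, the background chatter dominates the context and the derailment the comment predicts should show up quickly.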
> Disagree and push back
The other day an LLM gave me a script that had undeclared identifiers (it hallucinated a constant from an import).
When I informed it, it said "You must have copy/pasted incorrectly."
When I pushed back, it said "Now you trust me: The script is perfectly correct. You should look into whether there is a problem with the installation/config on your computer."