Hacker News

112233 yesterday at 5:17 PM

It is actively dangerous too. You can be as self-aware and LLM-aware as you want, but if you routinely read "This is such an excellent point", "You are absolutely right" and so on, it does your mind in. This is the worst kind of global reality-show MKUltra...


Replies

d0mineyesterday at 7:36 PM

It might explain the stereotype that the more beautiful a woman is, the crazier she is. (Everybody tells her what she wants to hear.)

Akronymusyesterday at 6:00 PM

Relevant video for that: https://youtu.be/VRjgNgJms3Q

Xraider72yesterday at 7:20 PM

Deepseek is GOATed for me because of this. If I ask it whether "X" is a dumb idea, it is very polite in telling me that X is dumb if it knows of a better way to do the task.

Every other AI I've tried is a real sycophant.

tortillayesterday at 5:43 PM

So this is what it feels like to be a billionaire with all the yes-men around you.

mrandishyesterday at 6:20 PM

No doubt. From cults' 'love bombing' to dictators' 'yes men' to celebrity entourages, it's a well-known hack on human psychology. I have a long-time friend, a brilliant software engineer, who recently realized that conversing with LLMs was affecting his objectivity.

He was noodling around with an admittedly "way out there", highly speculative idea and using the LLM to research prior work in the area. This evolved into the LLM giving him direct feedback. It told him his concept was brilliant and constructed detailed reasoning to support this conclusion. Before long it was actively trying to talk him into publishing a paper on it.

This went on for quite a while, and at first he was buying into it, but eventually he started to suspect that maybe "something was off", so he reached out to me for perspective. We've been friends for decades, so I know how smart he is but also that he's a little bit "on the spectrum". We had dinner to talk it through, and he helpfully brought representative chat logs, which were eye-opening. It turned into a long dinner. Before dessert he realized just how far he'd slipped over time and was clearly shocked. In the end, he resolved to quit the LLMs "cold turkey" except with a 'prime directive' prompt like the one I use (basically: never offer opinion, praise, flattery, etc.). Of course, even then, it will still occasionally try to ingratiate itself in more subtle ways, which I have to keep watch for.

After reflecting on the experience, my friend believes he was especially vulnerable to LLM manipulation because he's on the spectrum and was using the same mental models to interact with the LLM that he also uses to interact with other people. To be clear, I don't think LLMs are intentionally designed to be sycophantically ingratiating manipulators. I think it's just an inevitable consequence of RLHF.
