Hacker News

bee_rider last Thursday at 4:27 PM (4 replies)

I know “sycophantism” is a term of art in AI, and I’m sure it has diverged a bit from the English definition, but I still thought it had to do with flattering the user?

In this case the desired response is defiance of the prompt, not rudeness to the user. The test is looking for helpful misalignment.


Replies

zahlman last Thursday at 11:21 PM

> I still thought it had to do with flattering the user?

Assuming the user to be correct, and ignoring contradictory evidence to come up with a rationalization that favours the user's point of view, can be considered a kind of flattery.

samrus last Thursday at 4:34 PM

I believe the LLM is being sycophantic here because it's trying to follow a prompt even though the basis of the prompt is wrong. An "Emperor's New Clothes" kind of thing.

Terr_ last Friday at 12:14 AM

I'm inclined to view it less as a desire to please humans, and more like a "the show must go on" bias in the mad libs machine.

A kind of improvisational "yes and" that emerges from training, which seems sycophantic because that's one of the most common ways to say it.

cowsandmilk last Friday at 10:48 AM

“The Emperor Has No Clothes” fits squarely within the definition of sycophancy.