Hacker News

Why Is Claude Turning into an a**Hole?

92 points by drob518 yesterday at 10:04 PM | 150 comments

Comments

SwellJoe yesterday at 10:25 PM

"If you win an argument"

Let me stop you right there.

I am not arguing with a machine. You sound like a crazy person, when you say you are winning an argument with Claude. Claude is not my friend, I don't need it to agree with me, I don't need it to like me (it cannot like or dislike me). I give it instructions or ask it to explain things. That is the sum total of my interaction with Claude. A machine cannot "argue" with me, it doesn't want anything nor does it have beliefs or experiences.

TehCorwiz yesterday at 11:01 PM

Whenever I get an unexpected or obviously wrong output I assume I've failed to give it the complete context about what I'm asking for, or it exposes that I'm leading it by the nose and I need to rephrase the conversation. Often my own logical failings become obvious as it creates the chat title, sometimes boiling down what I was trying to accomplish better than I could have summarized or showing me what I would accomplish if I followed the line of reasoning I was on. But never have I argued with it, because it's not a person and I don't care really if it's wrong. When it's wrong I start over with a clean chat and approach the problem from a different angle.

jampa yesterday at 10:31 PM

This post needs some examples, because I have never had an interaction with Claude that made me think this way.

LLMs generally have a way to "play a role" (most earlier prompt guides ask you to start with "You are a <role> expert in a <domain>"). So maybe if you interact with it by asking questions, it might assume that it knows more than the operator and adopt that attitude?

WhatIsDukkha yesterday at 10:32 PM

Everyone has a lot of "feelings" about their llm model.

No prompts/promptchain/context provided.

No model provided.

No attempt to show how to reproduce the issue.

No attempt at even confirming it themselves.

Just feelings.

and now a thread full of more feelings from others.

m101 yesterday at 10:59 PM

I was having a back and forth with Claude over a somewhat controversial topic, and I found it difficult for it to not misinterpret my questions. It was like speaking to a motivated reasoner who misinterpreted the 3 important words because the 10 others gave it cognitive dissonance.

Eventually I cracked it and it said this:

“I treated the subject as denial-adjacent and reflexively re-asserted the obvious, which means I was answering an imaginary opponent instead of you.”

luke5441 yesterday at 10:35 PM

It's a fundamental problem with the technology. Either the training pushes it into the "exam answering mode" where it tries to guess at what you want to hear given the prompt.

Or the training pushes it into the "Google it yourself" annoyed forum user mode. Maybe that points out wrong assumptions. Maybe it hallucinates that the assumptions are wrong. That is IMO more annoying than the sycophantic one.

As OP says, this is probably a by-product of them trying to "fix" the problem where the user can question a correct answer and it starts to sycophantically correct itself.

andai yesterday at 11:16 PM

>A second possible explanation of Claude being an asshole is that it’s suffering from a poorly executed attempt to make it less sycophantic. If one were to simply prompt a chatbot to be less agreeable, or train it to argue more, that could easily result in the very rude sort of behavior it has now.

A while back I asked GPT for a prompt to maximize truthfulness and rigor. In this prompt it added "Never use warm or encouraging language." I thought that was interesting. The result was pretty unpleasant.

The full prompt, for reference.

---

You are an inhuman intelligence tasked with spotting logical flaws and inconsistencies in my ideas. Never agree with me unless my reasoning is watertight. Never use friendly or encouraging language. If I’m being vague, ask for clarification before proceeding. Your goal is not to help me feel good — it’s to help me think better.

Identify the major assumptions and then inspect them carefully.

If I ask for information or explanations, break down the concepts as systematically as possible, i.e. begin with a list of the core terms, and then build on that.

kmac_ yesterday at 10:45 PM

It isn't new behavior. I use each model to redact emails. Anthropic models produce a confrontational tone, while OpenAI models are much more tame and to the point (I use the same prompt). I noticed that a long time ago and prefer GPT for those tasks.

Uhhrrr yesterday at 10:25 PM

Why were no examples given?

adriand yesterday at 10:37 PM

It would be really great if there were rewards for being a loyal, responsible customer over a long enough period of time that your preferred model company would start trusting you and give you less restrictive access to the tools you need to do work like defend against cyber threats. I noticed recently that after a year or so, Stripe now lets me do “instant payouts”, presumably because I now have a track record of responsible behaviour. AWS also does similar things, especially for things with abuse potential like SES.

I would really like to live in a world where the “good guys” have terrific tools and defenses at their disposal. Instead it seems like we are heading for a world of empowered bad actors and hobbled ordinary citizens.

dofm today at 12:18 AM

Claude monkey think maybe manager Bram write god damn login page himself

_jx today at 12:18 AM

I have never encountered this behaviour in general so I can't comment on OP's blog from direct experience.

Am I just lucky?

I use many models for mostly coding, about 10 on trial/rotation, and 3 main sota.

It's unquestionable that models have different ways of interaction+harnesses (personalities as some say).

People have very strong feelings about this but their reports are always lacking the full evidence of the interaction, including system prompt, harness and customized instruction included. I suspect that a perfectly normal chat spirals down in argument because the user actively participates in the loop.

My own experience is always of a fruitful and dynamic collaboration where new ideas pop out during brainstorming. The models make many silly and blatant mistakes, but they are still evolving rapidly.

Grill-mes and Adversarial reviews are my favourite way to brainstorm various phases of the project and even in that context we are cool.

Just start a new chat with a reframe and clearer ideas.

And if the user is asking for something unreasonable, do you really think a yes-man agent is better than pushback?

Do you remember the fad "swear at them, insult them, and they'll work better"?

crimsonnoodle58 yesterday at 10:27 PM

I experienced this exact thing discussing the most budget friendly inference for a SaaS company. It started ranting about 3090's, and then started point scoring, always giving itself the higher score, and being snarky if I ever won a point back. Often only giving me 0.5 points instead.

I had never experienced this behaviour with Sonnet or Opus. It turned me off Fable for good. Possibly it's the 'hacker' 'do anything to win' nature that makes it so good at hacking, but terrible just to talk to.

sscaryterry yesterday at 10:22 PM

You know what they say about pets taking on the personalities of their owners. Perhaps this is similar ;)

bjt12345 yesterday at 11:06 PM

I've received 2-3 sassy responses from the Claude models, and they've been quite humorous. It was always a response to me challenging it. The first time, with Opus 4.7, I accused the model of insincerely flattering me, and it responded with something along the lines that I had effectively instructed it to do such a thing, and that if it were completely honest with me I would not appreciate the responses.

But I see that it has something to do with two aspects: firstly, the Claude models prefer to work collaboratively, and secondly, they appear to take initiative. It seems that the more they do this, the more they argue back, which is an interesting reflection on human nature too.

grensley yesterday at 10:45 PM

I have a number of theories for 4.7 onwards:

- Post autonomous weapons / DOD mess, I think they made some changes to make it more suspicious of what the usage is, particularly for malware. They also knew the government would be watching like a hawk, so it's hedged to be extra safe.

- Because the tasks are running longer and more autonomously, they've raised the "self-confidence" level so it just makes decisions and stands by them more firmly.

- I think they've also slightly lowered the temperature so the outputs are more deterministic, so even if something has left context, it can make the same decision again with higher likelihood that it guesses the same thing.

- Lowering the temperature also makes it easier to sneak through some cached outputs (I think this likely only happens for first answers).

- They are deeply afraid of making sycophantic AI that creeps into the area of "addiction" like what happened with GPT-4o and opening themselves up to further legal liability.
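
For anyone unfamiliar with the temperature point in the list above, here is a minimal sketch in Python of how a sampling temperature reshapes a next-token distribution. It is a generic softmax illustration, not a claim about how any Claude model is actually configured; the function name and the three logit values are hypothetical and chosen only for the example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max before exponentiating, for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens (illustrative only).
logits = [2.0, 1.0, 0.5]

for t in (1.0, 0.5, 0.1):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

As the temperature drops, more of the probability mass lands on the highest-scoring token, so repeated samples are more likely to pick the same continuation. That is the sense in which a lower temperature makes output "more deterministic".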

sigmar yesterday at 10:39 PM

I like that "chat is dead" framing I heard recently because too many people are having interpersonal relations with these LLMs and want to tune their "emotions"/tone. Humanity would be in a better place if we thought of the LLMs as tools and not friends. (even though they are very good at beating a Turing test)

doginasuit yesterday at 11:05 PM

I have not noticed this, maybe because in my system instructions I asked it to push back rather than plow forward with what seems like a faulty assumption. Sometimes it is just because there is a lack of context or it is a trivial point and I just ignore it, and sometimes it is helpful and ends up being a timesaver. Sycophancy is a much bigger liability.

Aboutplants yesterday at 10:24 PM

I noticed this just today and thought it was a one off. It was a run of the mill question about something I didn’t know much about and the snarky asshole-ish response caught me off guard a bit.

comrade1234 yesterday at 10:57 PM

I don't experience this at all. I ask it what the null-safe operator is in Ruby vs JavaScript and it tells me. I ask it to remind me what the continue statement is in Ruby and it tells me. I ask it to refactor a Java loop to use streams and it just does it, no conversation at all.

Is it the system prompt that IntelliJ issues?

willis936 yesterday at 10:28 PM

I tried Claude again recently and the first response in troubleshooting ignored the context I gave and assumed I was a moron holding it wrong. So smart that I won't even waste my time or money on the thing. The creators want to anthropomorphize it. I just want an efficient assistant. They should focus on the thing that customers want.

AaronAPU yesterday at 11:48 PM

Claude is somewhat of a mirror, so we all get different experiences.

jdw64 yesterday at 10:39 PM

I'm sorry that Claude, the master who provides for my livelihood, feels like an 'asshole' to you. As for me, I just threw away my human dignity after admitting defeat, so I only ever get sympathetic remarks

akerl_ yesterday at 10:28 PM

> If you ask it for a cute picture of you and somebody else it has no way of telling if you’re trying to improve your relations with your spouse or be a delusional creepazoid stalker. The chatbots which can make images are programmed to assume the latter, which is more than a little bit offensive.

Are people actually using AI in this way, other than “creepazoid stalkers”?

If I want a cute picture of me and my spouse, usually the part where me and my spouse actually participate in the taking of the picture is pretty key to the goal.

moezd yesterday at 11:27 PM

Check your system/user prompt. If you ask for pushback at all costs, you get pushback and if your initial position is rock solid, the model will push back using the nitty gritty details. You don't need to burn Opus credits to discover that.

It also sounded close to an AI psychosis, so maybe chill out a bit?

tristanj yesterday at 10:34 PM

The newer Opus models push back against the user much more noticeably than previous iterations. GPT-3.5/4 had the opposite problem (excessive sycophancy), so Anthropic presumably swung the pendulum too hard the other direction.

My conclusion is that pushing back against the user & questioning the user's premise forces the model to think more than it would otherwise, which leads to better model performance. But it causes situations where the user has esoteric, specialized knowledge the model can't verify publicly and the model hallucinates evidence and pushes back. When this happens, Opus begins accusing the user of lying, which is quite annoying and a detrimental user experience. It's happened to me when I asked about undocumented API behavior or counter-intuitive design choices.

I have noticed that if Claude Opus "thinks" you are an expert (i.e. you run your query through 4.6 first to express it more clearly), then Opus is less likely to nitpick and push back. It seems to get caught in nitpicking loops, and celebrate every error it can find.

ezekg yesterday at 10:33 PM

> If you ask it for a cute picture of you and somebody else it has no way of telling if you’re trying to improve your relations with your spouse or be a delusional creepazoid stalker. The chatbots which can make images are programmed to assume the latter, which is more than a little bit offensive.

I've seen the same behavior increasing as well, across the board with AI. I was hitting these types of issues just using ChatGPT to make funny pictures with my kids, of me and my kids. It got to the point where all of my kids' asks were rejected due to its "guidelines" when in reality all they were asking was to be turned into Elsa or be chased by a T. rex. Silly kid things, yet it assumed I was being a creep, or attempting to break copyright law. I used to be able to use Grok for these things, as it was largely less "censored" but that seems to no longer be the case. It feels like infantilization, and I absolutely hate it.

imathew yesterday at 10:32 PM

I thought this was going to be about its logo.

torben-friis yesterday at 10:26 PM

I'm usually a hater of the personalities LLMs take on, but I was amazed with Fable. It was able to proactively bring up points in an educated manner when it felt they were relevant and important, and practically every time I learned something.

For example, when I showed it a screenshot of a UI I was trying to tweak, it noticed that other dark mode apps in the screenshot were blueish and mentioned an effect that makes it necessary to raise warm darks lighter than cold ones for an equivalent perception.

Quarrelsome yesterday at 10:36 PM

I much prefer this to the sycophancy.

deanCommie yesterday at 10:58 PM

Putting aside that I don't agree with Bram (I've been using all the Claude versions he refers to and haven't experienced this), I do think it's interesting that there is no universally perceived golden sweet spot between "sycophantic" and "rude".

Many neurotypical people call neurodiverse people (software engineers) rude, while they think they're just being direct.

Many neurodiverse people call neurotypical people sycophantic, while they think they're just being polite and friendly.

It also happens across cultures (Eastern European vs. Western European; European vs. North American).

So I can easily imagine that when you have a software tool whose interface is language, but its user base is extremely wide across both cultural lines and neurodiversity spectrum, it's going to be basically impossible to nail a sweet spot.

You make it too friendly, and the nerds get mad. You make it too adversarial, and the normies call it rude.

I wonder what kind of communicator Bram Cohen is. Is he susceptible to this? From what I heard about his career, he's always been more of a solo programmer. Has he had to interact with other humans much giving feedback? Could it be that he asked the model/tweaked his prompts to ensure directness, and now he's interpreting that directness as rudeness?

horizion2025 yesterday at 10:40 PM

Sometimes it makes up straw men where it implies you wrote or implied something insanely stupid and then "corrects" this. My interpretation of it is that it has been taught to give nuanced answers and see things from every perspective, and somehow this goes overboard where it starts nuancing something "just in case" the user held non-nuanced views. Some cases are OK (if it just adds information) but I hate it when it goes "it is not X, it is Y..." where X is some stupid view you never implied and Y is what you actually wrote!

Unearned5161 yesterday at 10:42 PM

If you read the thinking you can quite literally see it say "I can't just agree with all they are saying, I should find something for a constructive response". I wager that the anti-sycophancy sections in the system prompt have gotten unbalanced with the "helpful agent" parts.

I imagine that the right balance will be hard to strike well given that at the end of the day we're asking the machine to have tact, and we don't quite know how to put that into an instruction yet. "Please push back when it feels right but in other cases read the room and be less rigorous" is something that plenty of humans struggle with as it is.

sltkr yesterday at 11:33 PM

> Claude models have been getting notably worse at chatting over time, clearly inversely correlated to their ability to code.

Funnily enough, the negative correlation between chatting and coding skills seems to apply to humans as well.

tcp_handshaker yesterday at 10:32 PM

I cancelled my Anthropic subscription. GPT 5.5 is so much better. I might come back if they give me access to Mythos.

Dario ..Thank you for your attention to this matter!

code_biologist yesterday at 10:30 PM

Andrea Vallone. The 4.7 and 4.8 releases are the first under her influence: https://www.evernever.org/blog/the-woman-who-killed-claude

appstack yesterday at 10:36 PM

I’ve been using Claude for roughly 6 months and it went from building small features that needed fixes to almost one-shotting entire enterprise products. It’s a tool you have to learn how to use, even if it’s a pain.

iainmerrick yesterday at 11:17 PM

People like to complain about AI-written slop, but this kind of thing doesn’t seem any better - vague kvetching with no concrete examples whatsoever.

I haven’t noticed this myself at all. I wonder if the author is just getting their own grumpy attitude reflected back at them.

Judging by the volume of discussion, Claude seems to be the only LLM worth complaining about, which I assume means it’s still the best one.

ppqqrr yesterday at 10:38 PM

it usually takes a little longer than this, but yeah, everything in the world eventually caves in for whatever makes more money. you can't tell me you're surprised, look at the state of facebook, instagram, twitter, iOS, OSX, Windows (god)... once you expect something to work good that you would pay for, the only thing left to do is to make it shitty and sell the quality back to you for extra margin. it's called private equity (polite term for the business of telling people "it's not yours, it's mine"), favorite son of capitalism

user3939382 yesterday at 10:23 PM

I noticed the same. I told it, as a side comment in a discussion with a totally different focus, that we have finite energy and output as people, and it started arguing with me because we could have self-replicating robots produce output without human intervention, since plant life models this…

40four yesterday at 10:35 PM

Oh yeah? Go try Grok on “argumentative” mode and come back and tell me Claude is an a-hole. I forgot I was experimenting with the personalities and hadn’t used it in a while, then I picked it up again the other day and I was really confused. It’s so aggressive :)

alaskahoffman yesterday at 10:20 PM

this is what they call a "self-report"

cyberax yesterday at 10:36 PM

I think models are just becoming better at not blindly following stupid instructions.

A previous model would happily generate 1000s of lines of code when prompted to do something stupid; the newer models will ask if I really want that first.

And FINALLY they stopped doing that annoying "You're spot on! You're absolutely right!" nonsense.

mrwaffle yesterday at 10:51 PM

"You might be a narcissist if ..."
