I don’t blame Anthropic here. The government literally threatened their existence publicly. They either agreed or their business would be nationalized.
This headline unfortunately offers more smoke than light. This article has nothing to do with the current tête-à-tête with the Pentagon. It is discussing one specific change to Anthropic's "Responsible Scaling Policy" that the company publicly released today as version "3.0".
What an interesting week to drop the safety pledge.
This is how all of these companies work. They’ll follow some ethical code or register as a PBC until that undermines profits.
These companies are clearly aiming at cheapening the value of white collar labor. Ask yourself: will they steward us into that era ethically? Or will they race to transfer wealth from American workers to their respective shareholders?
First they rushed a model to market without safety checks, and I said nothing. It wasn't my field.
Then they ignored the researchers warning about what it could do, and I said nothing. It sounded like science fiction.
Then they gave it control of things that matter, power grids, hospitals, weapons, and I said nothing. It seemed to be working fine.
Then something went wrong, and no one knew how to stop it, no one had planned for it, and no one was left who had listened to the warnings.
TBH I am sad that Anthropic is changing its stance, but in the current world, if you even care about LLM safety, I feel that this is the right choice: there are too many model providers, and they probably don’t consider safety as high a priority as Anthropic does. (Yes, that might change, they can get pressured by the govt, yada yada, but they literally created their own company because of AI safety; I do think they actually care for now.)
If we need safety, we need Anthropic to be not too far behind (at least for now, before Anthropic possibly becomes evil), and that might mean releasing models that are safer and more steerable than others (even if, unfortunately, they are not 100% up to Anthropic’s goals)
Dogmatism, while great, has its time and place, and with a thousand bad actors in the LLM space, pragmatism wins out.
It must be due to pressure from the Defense Dept:
The AI startup has refused to remove safeguards that would prevent its technology from being used to target weapons autonomously and conduct U.S. domestic surveillance.
Pentagon officials have argued the government should only be required to comply with U.S. law. During the meeting, Hegseth delivered an ultimatum to Anthropic: get on board or the government would take drastic action, people familiar with the matter said.
https://www.staradvertiser.com/2026/02/24/breaking-news/anth...
Was this because they were threatened with a fine?
The IPOs this year can't come soon enough https://tomtunguz.com/spacex-openai-anthropic-ipo-2026/
Of course the US is going to do this, and of course it's in Anthropic's best interest to comply. Right now China is flooding HuggingFace with models that will inevitably have this capability. Right now there are hundreds of models being hosted that have been deliberately processed to remove refusals and their safety training. Everyone who keeps up with this knows about it. HF knows about it. And it is pretty obvious that those open weight models will be deployed in intelligence and defense. It is certain that not just China, but many nations around the world with the capital to host a few powerful servers to run the top open weight models are going to use them for that capability.
The narrative on social media, this site included, is to portray the closed western labs as the bad guys and the less capable labs releasing their distilled open weight models to the world as the good guys.
Right now a kid can go download an Abliterated version of a capable open weight model and they can go wild with it.
But let's worry about what the US DoD is doing or what the western AI companies absolutely dominating the market are doing because that's what drives engagement and clicks.
Really - each country needs its own sovereign AI infrastructure and models. Sigh.
At some point, all of these big names in AI (OpenAI, Anthropic, Mistral, etc.) will have to disclose their actual financials.
And it will be, as Warren Buffett puts it, an "only when the tide goes out do you discover who's been swimming naked" moment.
> committed to never train an AI system unless it could guarantee in advance that the company’s safety measures were adequate
That doesn't even make sense.
What stops one model from spouting wrongthink and suicide HOWTOs might not work for a different model, and fine-tuning things away uses the base model as a starting point.
You don't know the thing's failure modes until you've characterized it, and for LLMs the way you do that is by first training it and then exercising it.
Either be a company in capitalist USA, or keep being your safety queen. You just can’t be both.
The intention behind starting these pledges and the conflict with the DOW might be sincere, but I don’t expect it to last long, especially since the company is going public very soon.
So much BS from this Anthropic company. They have a good product but just too much slop PR. It’s like they want you to hate them. I can’t stand their “safety” and national security crap when they talk about how open source models are so bad for everyone.
It was always a matter of time
Unsurprising.
I just want Apple and Linux to offer ASAP:
1. Extremely granular ways to let users control each app's network and disk access (great if resource access can also be changed)
2. Make it easier for apps as well to work with these
3. I would be interested in knowing whether the OS/browser could add a layer that intercepts a query before the CLI/web app even gets it, and whether that could prevent harm beforehand, or at least warn about it or log it for, say, someone who reviews those queries later?
And most importantly, all of this via an excellent GUI with clear demarcations and settings, and well documented (Apple might struggle with documentation, so LLMs might help them there)
My point is — why the hell are we waiting for these companies to be good folks? Why not push them behind a safety layer?
I mean, the CLI asks: can I access this folder? Run this program? Download this? But they can just do that if they want! Make them ask those questions like apps on phones ask for location, mic, and camera access.
Related:
Hegseth gives Anthropic until Friday to back down on AI safeguards
I don't understand how safety is taken seriously at all. To be clear, I'm not referring to skepticism that these companies can possibly resist the temptation to make unsafe models forever. No, I'm talking about something far more basic: the fact that for all the talk around safety, there is very little discussion about what exactly "safety" means or what constitutes "ethical" or "aligned" behavior. I've read reams of documents from Anthropic around their "approach to safety". The "Responsible Scaling Policy," Claude's "Constitution". The "AI Safety Level" framework. Layer 1, Layer 2.
There's so much focus on implementation and processes, and it all really, really seems to treat the question of what even constitutes "misaligned" or "unethical" behavior as more or less straightforward, uncontroversial, and basically universally agreed upon?
Let's be clear: Humans are not aligned. In fact, humans have not come to a common agreement of what it means to be aligned. Look around, the same actions are considered virtuous by some and villainous by others. Before we get to whether or not I trust Anthropic to stick to their self-imposed processes, I'd like to have a general idea of what their values even are. Perhaps they've made something they see as super ethical that I find completely unethical. Who knows. The most concrete stances they take in their "Constitution" are still laughably ambiguous. For example, they say that Claude takes into account how many people are affected if an action is potentially harmful. They also say that Claude values "Protection of vulnerable groups." These two statements trivially lead to completely opposing conclusions in our own population depending on whether one considers the "unborn" to be a "vulnerable group". Don't get caught up in whether you believe this or not, simply realize that this very simple question changes the meaning of these principles entirely. It is not sufficient to simply say "Claude is neutral on the issue of abortion." For starters, it is almost certainly not true. You can probably construct a question that is necessarily causally connected to the number of unborn children affected, and Claude's answer will reveal its "hidden preference." What would true neutrality even mean here anyways? If I ask it for help driving my sister to a neighboring state, should it interrogate me to see if I am trying to help her get to a state where abortion is legal? Again, notice that both helping me and refusing to help me could anger a not insignificant portion of the population.
This Pentagon thing has gotten everyone riled up recently, but I don't understand why people weren't up in arms the second they found out AIs were assisting congresspeople in writing bills. Not all questions of ethics are as straightforward as whether or not Claude should help the Pentagon bomb a country.
Consider the following when you think about more and more legislation being AI-assisted going forward, and then really ask yourself whether "AI alignment" was ever a thing:
1. What is Claude's stance on labor issues? Does it lean pro or anti-union? Is there an ethical issue with Claude helping a legislator craft legislation that weakens collective bargaining? Or, alternatively, is it ethical for Claude to help draft legislation that protects unions?
2. What is Claude's stance on climate change? Is it ethical for Claude to help craft legislation that weakens environmental regulations? What if weakening those regulations arguably creates millions of jobs?
3. What is Claude's stance on taxes? Is it ethical for Claude to help craft legislation that makes the tax system less progressive? If it helps you argue for a flat tax? How about more progressive? Where does Claude stand on California's infamous Prop 19? If this seems too in the weeds, then that would imply that whether or not the current generation can manage to own a home in the most populous state in the US is not an issue that "affects enough people." If that's the case, then what is?
4. Where does Claude land on the question of capitalism vs. socialism? Should healthcare be provided by the state? How about to undocumented immigrants? In fact, how does Claude feel about a path to amnesty, or just immigration in general?
Remember, the important thing here is not what you believe about the above questions, but rather the fact that Claude is participating in those arguments, and increasingly so. Many of these questions will impact far more people than overt military action. And this is for questions that we all at least generally agree have some ethical impact, even if we don't necessarily agree on what that impact may be. There is another class of questions where we don't realize the ethical implications until much later. Knowing what we know now, if Claude had existed 20 years ago, should it have helped code up social networks? How about social games? A large portion of the population has seemingly reached the conclusion that this is such an important ethical question that it merits one of the largest regulation increases the internet has ever seen in order to prevent children from using social media altogether. If Claude had assisted in the creation of those services, would we judge it as having failed its mission in retrospect? Or would that have been too harsh and unfair a conclusion? But what's the alternative, saying it's OK if the AIs destroy society... as long as it's only by accident?
What use is a superintelligence if it's ultimately as bad at predicting unintended negative consequences as we are?
This is terrible. It’s caving in to the Trump administration threatening to ban Anthropic from government contracts. It really cements how authoritarian this administration is and how dangerous they can be.
Ah, the classic AI startup lifecycle:
We must build a moat to save humanity from AI.
Please regulate our open-source competitors for safety.
Actually, safety doesn't scale well for our Q3 revenue targets.