> In light of the ability of recent models to accelerate their own development, we’ve implemented...

bkjlblh • yesterday at 5:50 PM • 25 replies

> In light of the ability of recent models to accelerate their own development, we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design). Using Claude to develop competing models already violates our Terms of Service, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms.

> Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, these safeguards will not be visible to the user. Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions will not affect the vast majority of coding work. We estimate they will impact ~0.03% of traffic, concentrated in fewer than 0.1% of organizations.

Replies

davedx • today at 7:15 AM

Could this be legally construed as anti-competitive behavior?

Edit: I asked Claude. It replied:

> Consumer protection / deceptive practices. In the EU this would be a clear UCPD (Unfair Commercial Practices Directive) issue and potentially a DSA violation. In the US, FTC Act §5 prohibits "unfair or deceptive acts." Selling a product that secretly performs worse than advertised for a commercially self-serving reason, without disclosure, is textbook deception. The Samsung/Apple battery throttling cases are instructive here: Apple faced regulatory action across multiple jurisdictions specifically because users weren't told.

> Competition law. This is where "anti-competitive" gets complicated. Refusing to help competitors build competing products via your ToS is generally legal — you can decide who you license to. But covertly sabotaging output quality for a class of users while charging them full price crosses into different territory. Under EU competition law (Article 102 TFEU), if a company with dominant market position uses covert technical means to disadvantage competitors, that's closer to abusive conduct than a legitimate ToS restriction.

cedws • yesterday at 6:19 PM

This makes me want to see China and open models succeed more than anything :)

mips_avatar • yesterday at 6:01 PM

It's bad that Anthropic can determine what this means. If you're building a modern app you're likely training your own embedding models, and now Anthropic can just silently sabotage your training pipelines?

matheusmoreira • yesterday at 6:15 PM

Looks like Anthropic's definition of safety includes their own safety from competition.

digitaltrees • yesterday at 11:58 PM

This feels less like "we are worried about security" and more like "we are in the lead and plan to keep it that way until it's too late." In some ways it's been helpful that OpenAI and Anthropic are tipping their hands about their anticompetitive instincts and willingness to steamroll their own clients, customers, and society. But it does feel like it's too late to stop this. The advantage people get by using these tools is too tempting to resist even if it is self-defeating. It feels like watching people light their own house on fire to stay warm in the deepest, darkest days of winter.

seemaze • yesterday at 7:06 PM

Ah, so this is why raw Mythos was too "dangerous" to release...

nullbio • today at 5:57 AM

Just so everyone is aware: Anthropic has been sabotaging AI researchers and their codebases and shadow-nerfing accounts for several years at this point. This isn't new, but they hadn't disclosed it until now, likely because it is getting to the point where it's too noticeable, or they're concerned about it leaking from employees.

rastrojero2000 • yesterday at 8:51 PM

So that's a possible reason why my specific Claude Opus instance seemed to be impossibly stupid and always degenerated into doing really dumb things to my code!

Cool, good to know I can trust Anthropic.

chrisoosthuizen • yesterday at 8:49 PM

This feels like the start of a much bigger plan for Anthropic to close off the use cases of its models and eat any of its competitors.

johnnyApplePRNG • yesterday at 8:54 PM

> Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT).

Am I to understand that this is essentially their form of social-platform ghosting instead of banning?

So they're not even going to tell you that the question you're asking is against their rules, they're just going to twist up your question and/or the answer somehow such that you waste your time essentially?

It seems like I ran into this EXACT same functionality from Claude many months ago when I was trying to ask it to research on the web and help me set up the ideal llama.cpp config for local LLM inference.

Funny how lost it got through that relatively simple install when we had all of the documentation in the world (and a human dev with 20+ years of experience guiding it along) to go by... and simultaneously it's debugging and building high-level cryptography code in Rust in the other terminal tab.

This is infuriating to learn.

Jabrov • yesterday at 6:01 PM

A million AI researcher voices at big tech companies suddenly cried out in terror and were suddenly silenced

hashmap • yesterday at 7:05 PM

My guess is 3 months before asking what to eat before a linear algebra exam trips the machine learning topic ban. I got flagged immediately asking why my JEPA thing breaks weird.

2001zhaozhao • yesterday at 6:24 PM

How do they detect whether an experiment being done on a smaller model is used to improve a competing frontier model, or just an innocuous hobbyist LLM experiment?

maxall4 • yesterday at 10:11 PM

These safeguards are ridiculously sensitive: a prompt as simple as “Why is an infinitely slow process reversible?” gets flagged as a ToS violation.

largbae • yesterday at 10:18 PM

Pull that ladder up behind ya, will ya son?

rfgplk • yesterday at 6:11 PM

Meaningless and easily bypassable. Will actually try coding up a tensor library with it, see if it sabotages anything.

novaomnidev • yesterday at 11:15 PM

So Fable will intentionally lie to you and give you incorrect outputs, if it doesn’t like what you’re asking. Got it.

theLiminator • yesterday at 6:19 PM

This is pretty bullshit, now you have no idea if your output is getting silently nerfed.

thepasch • yesterday at 7:03 PM

Yeesh. Anthropic's paranoia about China is starting to get pathological.

rspeele • yesterday at 6:23 PM

It's afraid!

thothless • yesterday at 9:22 PM

the gall of these companies to regulate your usage of stolen knowledge is absolutely hilarious.

and they want me to pay $100+ a month to be their training?

i hope we can find morality again.

gck1 • yesterday at 9:41 PM

But Chinese models will poison your output if you ask them about Tiananmen Square! That's not good, so poisoning everyone's output without telling them is the only way to prevent that.

Come on guys, why can't everyone just be there for the good guy?

827a • yesterday at 10:36 PM

This is deeply vile behavior; not remotely the actions of good people.

spaceclay • yesterday at 10:43 PM

[dead]
