kamranjon • yesterday at 10:22 AM • 21 replies

DeepSeek continues to not only push the boundaries but also publish these incredible papers explaining how they achieved their gains - something the American labs no longer do unfortunately. Chinese labs are doing the most interesting work in AI right now.

Replies

sigmar • yesterday at 3:34 PM

>publish these incredible papers explaining how they achieved their gains - something the American labs no longer do unfortunately.

Google is still releasing a lot of LLM architecture research. They introduced speculative decoding of LLMs in 2022[1], then released the code to perform speculative decoding for their Gemma 4 model this year[2].

[1] https://arxiv.org/abs/2211.17192

[2] https://github.com/google-gemma/cookbook/blob/main/docs/mtp/...

tomalaci • yesterday at 10:28 AM

Probably because American AI companies are on the hook for quite a lot of investment money. I think they are trying to find the magical moat to justify their valuation.

Revealing optimizations similar to these would pretty much reduce their competitive position.

herodoturtle • yesterday at 10:25 AM

Publishing by necessity, I wonder? American labs are on the cutting edge, pioneering the way forward, so DeepSeek open-sourcing what they’ve got helps even the playing field.

Hopefully the experts here can offer insight. The above is just my hunch and I’m not a specialist in this field.

gmerc • yesterday at 12:57 PM

Deepseek is commoditizing the performance gains US labs rely on to make their investors money.

janalsncm • today at 2:14 AM

Their R1 paper was really well-done. But I think it leaves out a few details necessary for stable training.

https://cameronrwolfe.substack.com/p/grpo-tricks

garn810 • yesterday at 12:24 PM

Yep. It's about time the Western world realized the Chinese are not the "very bad guys under dictatorship".

thesmtsolver2 • yesterday at 6:59 PM

This is so out of touch. Go to NeurIPS or the other top AI conferences to see what is happening.

teekert • yesterday at 1:27 PM

I'm deep seeking for that open in OpenAI indeed. It’s clear who’s the most anthropocentric in this space.

rvz • yesterday at 10:34 AM

Exactly. They did not have to open up their research, and this is what happens when smart researchers are forced to squeeze performance gains out of existing hardware.

They don't have TPUs or access to the latest Vera Rubin GPUs either to get performance gains for free. All of the optimizations DeepSeek has done are in software, and they go down to the PTX assembly level.

Compare that to Anthropic, who are celebrating fixing a flickering issue in a terminal app, a fix that took months.

SubiculumCode • yesterday at 5:59 PM

If American labs aren't publishing, it doesn't mean they aren't doing even more interesting work.

utopiah • yesterday at 12:09 PM

It's almost as if ... they were what OpenAI was when it started. Sad to see but glad someone is doing it.

epolanski • yesterday at 10:54 AM

R1 was very influential on US model development.

nelox • yesterday at 11:18 PM

Doing work ≠ publishing work

pmarreck • yesterday at 6:41 PM

They push the boundaries, alright. Of obtaining the results of work without doing the work themselves, which, I hate to say it, is classic Chinese Machiavellian business behavior:

https://www.cnbc.com/2026/06/24/anthropic-alibaba-distillati...

jmyeet • yesterday at 10:59 AM

Chinese companies (and labs) operate in conjunction with the CCP so whatever they're doing, it's because it's Chinese state policy.

What became clear when DeepSeek came onto the scene was that China was seeking to commoditize LLMs. They consider it an issue of national security not to be beholden to US tech companies when it comes to AI. And I, for one, fully endorse this policy.

Another data point on this is the black market for Claude tokens in China [1]. The chat logs themselves are a commodity to train models.

I believe that OpenAI in particular is a bet on a trillion dollar pot of gold that doesn't exist. Google, Microsoft, Amazon and Meta will all be fine. Anthropic is in a far better position than OpenAI (IMHO) but if DeepSeek or some other Chinese open weight model gets as good at coding, they're in real trouble too.

[1]: https://news.ycombinator.com/item?id=48667495

resters • yesterday at 1:42 PM

Thank you so much to everyone at DeepSeek who is working on this and who has the courage and generosity to open-source this for humanity.

We in the United States will never forget!

For all the harm Trump does to the US at least he is helping China!

OtomotO • yesterday at 12:12 PM

The difference between greed and power

dakolli • yesterday at 11:56 AM

It's because our culture worships pieces of paper the government tells us are worth something.

godwinson__4-8 • yesterday at 6:54 PM

The idea that America is going to stay ahead of China is I think at this point clearly delusional. It's also just such silly framing. Why should 350 million people stay ahead of 1 billion people on the other side of the world? If an AI lab in China cures cancer or something do Americans lose?

So many Americans seem to (at least in theory) be ready to sign up for this ongoing confrontation with China. Does anyone think it isn't America who is poking the bear when it comes to the Thucydides trap? Why not try to get along? It occurs to me the only people more Chinese innovation would hurt are the mega cap class in the United States. Elon Musk certainly doesn't want BYD in the United States. Same story all the way down with these super capitalized AI companies. Most average Americans would probably be better off in a world where the United States and China got along. But it's those Americans who will be called upon to suffer most of the burden if that trap ever springs.

darkoob12 • yesterday at 12:07 PM

Google and Microsoft publish more than enough, and American universities are publishing the science beyond DeepSeek's engineering. The fact that you don't know about them means you're not following the science, only reading Hacker News.

DivingForGold • yesterday at 11:50 AM

Sure, in part by "stealing" from American AI companies with distillation attacks:

https://yipzap.com/anthropic-accuses-alibaba-of-largest-ai-d...
