Hacker News

wrasee · 01/20/2025 · 7 replies

Except it’s not really a fair comparison, since DeepSeek is able to take advantage of a lot of the research pioneered by those companies with infinite budgets who have been researching this stuff in some cases for decades now.

The key insight is that those building foundational models and original research are always first, and then models like DeepSeek always appear 6 to 12 months later. This latest move towards reasoning models is a perfect example.

Or perhaps DeepSeek is also doing all their own original research and it’s just coincidence they end up with something similar yet always a little bit behind.


Replies

matthewdgreen · 01/20/2025

This is what many folks said about OpenAI when they appeared on the scene building on foundational work done at Google. But the real point here is not to assign arbitrary credit, it’s to ask how those big companies are going to recoup their infinite budgets when all they’re buying is a 6-12 month head start.

techload · 01/20/2025

You can learn more about DeepSeek and Liang Wenfeng here: https://www.chinatalk.media/p/deepseek-ceo-interview-with-ch...

byefruit · 01/20/2025

This is pretty harsh on DeepSeek.

There are some significant innovations behind v2 and v3, like multi-head latent attention (MLA), their many MoE improvements, and multi-token prediction.
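To make the MoE part concrete: a mixture-of-experts layer routes each input to a small top-k subset of expert networks and combines their outputs, weighted by renormalized router scores. Below is a toy, pure-Python sketch of that routing idea (the expert functions, router weights, and per-vector framing are illustrative assumptions, not DeepSeek's actual implementation, which routes per token inside a Transformer with load-balancing refinements):

```python
import math

def softmax(xs):
    m = max(xs)  # shift for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def moe_forward(x, experts, router_weights, k=2):
    """Route x to the top-k experts by router score, then return the
    gate-weighted sum of their outputs (gates renormalized over top-k)."""
    # Router: one linear score per expert.
    scores = [sum(w * xi for w, xi in zip(ws, x)) for ws in router_weights]
    gates = softmax(scores)
    topk = sorted(range(len(experts)), key=lambda i: gates[i], reverse=True)[:k]
    norm = sum(gates[i] for i in topk)
    out = [0.0] * len(x)
    for i in topk:
        y = experts[i](x)
        w = gates[i] / norm  # renormalize gates over the selected experts
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, topk

# Hypothetical experts and router for illustration.
experts = [
    lambda x: [2 * v for v in x],   # expert 0: doubles
    lambda x: [v + 1 for v in x],   # expert 1: shifts
    lambda x: [-v for v in x],      # expert 2: negates
]
router_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
y, chosen = moe_forward([1.0, 2.0], experts, router_weights, k=2)
```

Only the chosen k experts run, which is the point: parameter count grows with the number of experts while per-token compute stays roughly constant.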

wrasee · 01/20/2025

Also, if you think some of the big names are playing fast and loose with copyright and personal data, don’t forget that DeepSeek operates in a regulatory environment with even less regard for such things, especially foreign copyright.

gizmo · 01/20/2025

Fast following is still super hard. No AI startup in Europe can match DeepSeek for instance, and not for lack of trying.

netdur · 01/20/2025

Didn't DeepSeek's CEO say that Llama is two generations behind, and that's why they didn't use their methods?