byefruit • 01/20/2025

This is pretty harsh on DeepSeek.

There are some significant innovations behind v2 and v3, like multi-head latent attention, their many MoE improvements, and multi-token prediction.

Replies

wrasee • 01/20/2025

I don’t think it’s that harsh. And I also don’t deny that they’re a capable competitor and will surely mix in their own innovations.

But would they be where they are if they were not able to borrow heavily from what has come before?
