I think the fact that DeepSeek queries competitor models and trains on their outputs (i.e., distillation), along with its use of export-restricted Nvidia chips, helps explain how it can achieve such low training costs (USD 6 million vs. billions) while delivering only slightly worse performance than its American counterparts. It also undermines the narrative that DeepSeek or China poses a serious challenge to the U.S. lead in AI. The gap may be closing, but the initial reactions now look knee-jerk.
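To make the mechanism concrete: in its classic form (Hinton et al., 2015), distillation trains a smaller "student" model to match a larger "teacher" model's output distribution, which is far cheaper than learning from raw data alone. Below is a minimal PyTorch sketch of that soft-label variant; the tiny Linear models, temperature, and training loop are illustrative stand-ins, and in the API-scraping scenario people describe, the teacher's logits would have to be replaced by sampled text completions, since commercial APIs don't expose logits.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label knowledge distillation: the student is trained to
    match the teacher's softened output distribution via KL divergence."""
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_soft_student = F.log_softmax(student_logits / t, dim=-1)
    # Scale by t^2 to keep gradients comparable across temperatures
    return F.kl_div(log_soft_student, soft_teacher,
                    reduction="batchmean") * (t * t)

if __name__ == "__main__":
    torch.manual_seed(0)
    teacher = torch.nn.Linear(16, 8)   # stand-in for a large frozen model
    student = torch.nn.Linear(16, 8)   # stand-in for a cheaper model
    opt = torch.optim.Adam(student.parameters(), lr=1e-2)
    x = torch.randn(32, 16)
    with torch.no_grad():
        teacher_logits = teacher(x)    # "competitor responses" play this role
    for _ in range(100):
        loss = distillation_loss(student(x), teacher_logits)
        opt.zero_grad()
        loss.backward()
        opt.step()
    print(f"final KD loss: {loss.item():.4f}")
```

The point of the sketch is the economics: the expensive part (the teacher's forward passes, or in practice the competitor's training run) is paid for by someone else, and the student only fits its outputs.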
It's unfortunate that the discussion has been hijacked and shifted to moral superiority, because that was never the point in the first place.
These models never cost billions to train, and I doubt the final training run for models like GPT-4 cost more than eight figures. Six million is definitely cheaper, and I would attribute much of that to distillation.