Hacker News

Open Reproduction of DeepSeek-R1

166 points by yogthos today at 1:14 PM | 15 comments

Comments

Tiberium today at 2:04 PM

Last update over a year ago, so I hope (2025) gets added to the title:

> [2025/05/26] (Step 1 completed!) We release Mixture-of-Thoughts--a curated reasoning dataset of 350k verified traces distilled from R1. The dataset spans tasks in mathematics, coding, and science, and is designed to teach language models to reason step-by-step. We also provide a recipe to train OpenR1-Distill-7B, which replicates the reasoning capabilities of deepseek-ai/DeepSeek-R1-Distill-Qwen-7B and marks the completion of step 1 in the Open R1 project.

Doesn't look like they managed to actually reproduce R1; they stopped at Step 1 of their 3-step plan.

aesthesia today at 3:09 PM

If you really want to see fully open training pipelines for modern LLMs, Olmo and, to a lesser extent, Nemotron are what you should look at.

https://github.com/allenai/OLMo

https://github.com/NVIDIA-NeMo/Nemotron

madiator today at 2:17 PM

Check out OpenThoughts. It has a widely used dataset, a model that beats DeepSeek's smaller reasoning models, and a paper that describes the data curation methodology in detail.

https://www.open-thoughts.ai/

yieldcrv today at 2:41 PM

Too old now

christkv today at 2:40 PM

What is the estimated cost these days to train something like this to conclusion?

poppafuze today at 3:57 PM

"This will likely involve curating new, large-scale datasets for math, reasoning, and code." ... everybody likes to hand-wave on this.
