DeepSeek v4

517 points • by impact_sy • today at 3:01 AM • 222 comments

https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main...

Comments

throwa356262 • today at 6:17 AM

Seriously, why can't huge companies like OpenAI and Google produce documentation that is half this good??

https://api-docs.deepseek.com/guides/thinking_mode

No BS, just a concise description of exactly what I need to write my own agent.

revolvingthrow • today at 5:42 AM

> pricing "Pro" $3.48 / 1M output tokens vs $4.40

I’d like somebody to explain to me how the endless comments of "bleeding edge labs are subsidizing the inference at an insane rate" make sense in light of a humongous model like v4 pro being $4 per 1M. I’d bet even the subscriptions are profitable, much less the API prices.

edit: $1.74/M input $3.48/M output on OpenRouter

fblp • today at 3:53 AM

There's something heartwarming about the developer docs being released before the flashy press release.

yanis_t • today at 4:34 AM

Already on OpenRouter. The Pro version is $1.74/M input, $3.48/M output, while Flash is $0.14/M input, $0.28/M output.
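
To make the gap concrete, here is a minimal sketch that turns the per-million-token prices quoted above into a per-request cost. The `pro` and `flash` keys and the 50k-in / 2k-out request size are illustrative assumptions, and any prompt-caching discount is ignored.

```python
# Prices in USD per 1M tokens, as quoted in the comment above (OpenRouter listing).
# The dictionary keys are illustrative labels, not official model identifiers.
PRICES = {
    "pro": {"input": 1.74, "output": 3.48},
    "flash": {"input": 0.14, "output": 0.28},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request, ignoring any prompt-cache discount."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical agent turn: 50k tokens of context in, 2k tokens out.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 50_000, 2_000):.4f}")
```

At the quoted prices, input and output are each 12.4x cheaper on Flash than on Pro, so the ratio holds for any input/output mix.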

sidcool • today at 4:21 AM

Truly open source coming from China. This is heartwarming. I know of the potential ulterior motives.

mchusma • today at 4:43 AM

For comparison, on OpenRouter DeepSeek v4 Flash is slightly cheaper than Gemma 4 31b and more expensive than Gemma 4 26b, but it does support prompt caching, which means for some applications it will be the cheapest. Excited to see how it compares with Gemma 4.

nthypes • today at 3:45 AM

https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main...

Model was released and it's amazing. Frontier level (better than Opus 4.6) at a fraction of the cost.

seanobannon • today at 3:44 AM

Weights available here: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro

gbnwl • today at 3:49 AM

I’m deeply interested and invested in the field but I could really use a support group for people burnt out from trying to keep up with everything. I feel like we’ve already long since passed the point where we need AI to help us keep up with advancements in AI.

apexalpha • today at 6:17 AM

This Flash model might be affordable for OpenClaw. I run it on my Mac with 48 GB of RAM now, but it's slowish.

zkmon • today at 5:56 AM

They released 1.6 T pro base model on huggingface. First time I'm seeing a "T" model here.

bandrami • today at 5:54 AM

I don't mind that High Flyer completely ripped off Anthropic to do this so much as I mind that they very obviously waited long enough for the GAB to add several dozen xz-level easter eggs to it.

WhereIsTheTruth • today at 6:21 AM

Interesting note:

"Due to constraints in high-end compute capacity, the current service capacity for Pro is very limited. After the 950 supernodes are launched at scale in the second half of this year, the price of Pro is expected to be reduced significantly."

So it's going to be even cheaper

Imanari • today at 6:02 AM

Just tested it via OpenRouter in the Pi Coding agent and it regularly fails to use the read and write tools correctly. Very disappointing. Anyone know a fix besides prompting "always use the provided tools instead of writing your own call"?

CJefferson • today at 5:12 AM

What's the current best framework to have a 'claude code' like experience with Deepseek (or in general, an open-source model), if I wanted to play?

simonw • today at 4:35 AM

I like the pelican I got out of deepseek-v4-flash more than the one I got from deepseek-v4-pro.

https://simonwillison.net/2026/Apr/24/deepseek-v4/

Both generated using OpenRouter.

For comparison, here's what I got from DeepSeek 3.2 back in December: https://simonwillison.net/2025/Dec/1/deepseek-v32/

And DeepSeek 3.1 in August: https://simonwillison.net/2025/Aug/22/deepseek-31/

And DeepSeek v3-0324 in March last year: https://simonwillison.net/2025/Mar/24/deepseek/

rohanm93 • today at 5:37 AM

This is shockingly cheap for a near frontier model. This is insane.

For context, for an agent we're working on, we're using 5-mini, which is $2/1m tokens. This is $0.30/1m tokens. And it's Opus 4.6 level - this can't be real.

I am uncomfortable about sending user data which may contain PII to their servers in China so I won't be using this as appealing as it sounds. I need this to come to a US-hosted environment at an equivalent price.

Hosting this on my own + renting GPUs is much more expensive than DeepSeek's quoted price, so not an option.

zargon • today at 4:08 AM

The Flash version is 284B A13B in mixed FP8 / FP4 and the full native precision weights total approximately 154 GB. KV cache is said to take 10% as much space as V3. This looks very accessible for people running "large" local models. It's a nice follow up to the Gemma 4 and Qwen3.5 small local models.
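
As a rough sanity check on those numbers, the sketch below derives the average bits per parameter implied by 284B parameters in roughly 154 GB. It assumes decimal gigabytes and, as a simplification, that every parameter is stored at either 4 or 8 bits with no overhead for quantization scales or higher-precision layers, so the implied FP4 share is only an approximation.

```python
TOTAL_PARAMS = 284e9   # "284B" total parameters, from the comment
WEIGHT_BYTES = 154e9   # "approximately 154 GB", assumed to mean decimal gigabytes


def avg_bits_per_param(weight_bytes: float, params: float) -> float:
    """Average storage cost per parameter, in bits."""
    return weight_bytes * 8 / params


def fp4_fraction(avg_bits: float) -> float:
    """Share of parameters at 4 bits if each one is either 4-bit or 8-bit.

    Solves 4*f + 8*(1 - f) = avg_bits for f. This ignores scale factors and
    any layers kept at higher precision (a simplifying assumption).
    """
    return (8 - avg_bits) / 4


bits = avg_bits_per_param(WEIGHT_BYTES, TOTAL_PARAMS)
print(f"average bits per parameter: {bits:.2f}")
print(f"implied FP4 share: {fp4_fraction(bits):.0%}")
```

Under these assumptions the average comes out a little above 4 bits per parameter, which would mean the bulk of the weights are stored in FP4.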

xnx • today at 5:49 AM

Such a different time now than early 2025, when people thought DeepSeek was going to kill the market for Nvidia.

storus • today at 5:09 AM

Oh well, I should have bought 2x 512GB RAM MacStudios, not just one :(

jessepcc • today at 4:01 AM

At this point 'frontier model release' is a monthly cadence: Kimi 2.6, Claude 4.6, GPT 5.5. The interesting question is which evals will still be meaningful in 6 months.

Aliabid94 • today at 3:54 AM

MMLU-Pro:

Gemini-3.1-Pro at 91.0

Opus-4.6 at 89.1

GPT-5.4, Kimi2.6, and DS-V4-Pro tied at 87.5

Pretty impressive

tcbrah • today at 6:03 AM

giving meta a run for its money, esp when it was supposed to be the poster child for OSS models. deepseek is really overshadowing them rn

clark1013 • today at 5:14 AM

Looking forward to DeepSeek Coding Plan

jdeng • today at 3:53 AM

Excited that the long awaited v4 is finally out. But feel sad that it's not multimodal native.

gardnr • today at 5:39 AM

865 GB: I am going to need a bigger GPU.

tariky • today at 5:27 AM

Has anyone tried making web UIs with it? How good is it? For me, Opus is only worth it because of that.

taosx • today at 3:47 AM

Merge? https://news.ycombinator.com/item?id=47885014

luyu_wu • today at 3:43 AM

For those who didn't check the page yet, it just links to the API docs being updated with the upcoming models, not the actual model release.

sibellavia • today at 5:32 AM

A few hours after GPT5.5 is wild. Can’t wait to try it.

luew • today at 5:35 AM

We will be hosting it soon at getlilac.com!

KaoruAoiShiho • today at 3:57 AM

SOTA MRCR (or would've been a few hours earlier... beaten by 5.5), I've long thought of this as the most important non-agentic benchmark, so this is especially impressive. Beats Opus 4.7 here

reenorap • today at 4:19 AM

Which version fits in a Mac Studio M3 Ultra 512 GB?

mariopt • today at 4:48 AM

Does DeepSeek have a coding plan?

aliljet • today at 4:35 AM

How can you reasonably try to get near frontier (even at all tps) on hardware you own? Maybe under 5k in cost?

sergiotapia • today at 5:40 AM

Using it with opencode sometimes it generates commands like:

    bash({"command":"gh pr create --title "Improve Calendar module docs and clean up idiomatic Elixir" --body "$(cat <<'EOF'
    Problem
    The Calendar modu...

That is, it generates the call as output text but doesn't actually run the bash command, so the PR is never created. I wonder if it's a model thing or an opencode thing.
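
One hedged way to narrow this down is to log the raw model response and check whether the call arrived as plain text instead of as a structured tool call. The sketch below assumes an OpenAI-style message dict with `content` and `tool_calls` keys; opencode's internal format is not shown in this thread, and the tool names are hypothetical.

```python
import re

# Hypothetical tool names; substitute whatever the agent actually registers.
TOOL_NAMES = ("bash", "read", "write")

# Matches text that begins like a function-style call, e.g. `bash({`.
LEAKED_CALL = re.compile(r"^\s*(%s)\s*\(\s*\{" % "|".join(TOOL_NAMES))


def looks_like_leaked_tool_call(message: dict) -> bool:
    """True if the message has no structured tool calls but its text starts like one."""
    if message.get("tool_calls"):
        return False
    return bool(LEAKED_CALL.match(message.get("content") or ""))
```

If the raw response already has the call in `content` with an empty `tool_calls`, the text was produced upstream of the agent's own parsing (by the model or the provider's chat template); this check only detects the symptom and does not repair the call.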

swrrt • today at 4:04 AM

Any visualised benchmark/scoreboard for comparison between the latest models? DeepSeek v4 and GPT-5.5 seem to be groundbreaking.

namegulf • today at 4:27 AM

Is there a quantized version of this?

rvz • today at 4:00 AM

The paper is here: [0]

I was expecting the release to be this month [1], since everyone had forgotten about it and wasn't reading the papers they were releasing, and 7 days later here we have it.

One key point to look at is the optimization DeepSeek made to the residual design of the LLM's network architecture: manifold-constrained hyper-connections (mHC), from this paper [2]. That is what makes it possible to train the model efficiently, especially with the hybrid attention mechanism designed for it.

There was not much discussion of it here some months ago [3], but again, the paper is a recommended read.

I wouldn't trust the benchmarks directly, but would wait for others to try it for themselves to see if it matches the performance of frontier models.

Either way, this is why Anthropic wants to ban open weight models and I cannot wait for the quantized versions to release momentarily.

[0] https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main...

[1] https://news.ycombinator.com/item?id=47793880

[2] https://arxiv.org/abs/2512.24880

[3] https://news.ycombinator.com/item?id=46452172

punkpeye • today at 5:48 AM

Incredible model quality to price ratio

ls612 • today at 4:07 AM

How long does it usually take for folks to make smaller distills of these models? I really want to see how this will do when brought down to a size that will run on a Macbook.

frozenseven • today at 4:11 AM

Better link:

https://news.ycombinator.com/item?id=47885014

https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro

hongbo_zhang • today at 4:35 AM

congrats

dhruv3006 • today at 4:58 AM

Ah now !

creamyhorror • today at 4:09 AM

[dead]

hubertzhang • today at 4:44 AM

[dead]

maryjeiel • today at 4:11 AM

[dead]

minhajulmahib • today at 4:23 AM

[flagged]

slopinthebag • today at 5:10 AM

OMG

OMG ITS HAPPENING
