Hacker News

CamperBob2, last Sunday at 5:34 PM

As far as I'm aware, they all are. There are only five important foundation models in play -- Gemini, GPT, xAI, Claude, and DeepSeek. (edit: forgot Claude)

Everything from China is downstream of DeepSeek, which some have argued is basically a protégé of ChatGPT.


Replies

kingstnap, last Sunday at 6:10 PM

Not true. Qwen from Alibaba experiments with lots of different architectures.

Qwen3-Next, for example, has lots of weird things like Gated DeltaNet layers and all kinds of weird bypasses.

https://qwen.ai/blog?id=4074cca80393150c248e508aa62983f9cb7d...
