Kimi K2 is a very impressive model! It's particularly un-obsequious, which makes it useful for actually checking your reasoning on things.
Some ChatGPT models, especially older ones, will tell you that everything you say is fantastic and great. Kimi, on the other hand, doesn't mind taking a detour to question your intelligence, and likely your entire ancestry, if you ask it to be brutal.
A single 512GB M3 Ultra is $9,499.00
https://www.apple.com/shop/buy-mac/mac-studio/apple-m3-ultra...
Claims like this are, as always, misleading: they don't show the context length, or the prefill time if you use a lot of context. It will be fun waiting minutes for a reply.
Is there a Linux equivalent of this setup? I see some mention of RDMA support on Linux distros, but it's unclear to me whether this is hardware-specific (requiring ConnectX, or in this case Apple's Thunderbolt) or whether something interesting can be done with "vanilla 10G NIC" hardware.
I get tempted to buy a couple of these, but I just feel like the amortization doesn’t make sense yet. Surely in the next few years this will be orders of magnitude cheaper.
What benchmarks are good these days? I generally just try different models on Cursor, but most of the open-weight models aren't available there (DeepSeek V3.2 and Kimi K2 have some problems with formatting, and many others are missing), so I'd be curious to see some benchmarks - especially for non-web stuff (C++, Rust, etc.).
You should mention that it is 4bit quant. Still very impressive!
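For scale, a back-of-the-envelope sketch of why the quantization matters, assuming Kimi K2's roughly 1 trillion total (MoE) parameters and ignoring KV cache and runtime overhead:

```python
# Rough memory footprint of Kimi K2's weights at different quantization widths.
# Assumes ~1 trillion total parameters (MoE); KV cache and overhead not counted.
PARAMS = 1_000_000_000_000

def weight_gb(bits_per_param: float) -> float:
    """Approximate weight size in GB for a given quantization width."""
    return PARAMS * bits_per_param / 8 / 1e9

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: ~{weight_gb(bits):,.0f} GB")
# 16-bit: ~2,000 GB
#  8-bit: ~1,000 GB
#  4-bit:   ~500 GB
```

At 4 bits the weights alone come to roughly 500 GB, which is why a pair of 512GB machines works at all: one barely fits the weights, and two leave headroom for the KV cache.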
Does this also run with Exo Labs' token prefill acceleration using DGX Spark? I.e., take 2 Sparks and 2 Mac Studios and get inference speed comparable to what 2x M5 Ultras will be able to do?
Isn't this the same model that won the competition for drawing a real-time clock recently?
Is this using the new RDMA over Thunderbolt support from macOS 26.2?
Is there no API for the Kimi K2 Instruct...?
Kimi K2 is a really weird model, just in general.
It's not nearly as smart as Opus 4.5 or 5.2-Pro or whatever, but it has a very distinct writing style and also a much more direct "interpersonal" style. As a writer of very-short-form stuff like emails, it's probably the best model available right now. As a chatbot, it's the only one that seems to really relish calling you out on mistakes or nonsense, and it doesn't hesitate to be blunt with you.
I get the feeling that it was trained very differently from the other models, which makes it situationally useful even if it's not very good for data analysis or working through complex questions. For instance, as it's both a good prose stylist and very direct/blunt, it's an extremely good editor.
I like it enough that I actually pay for a Kimi subscription.