The "Deepseek moment" was just one year ago today!
Coincidence or not, let's just marvel for a second at the amount of magic/technology that's being given away for free... and how liberating and different this is from OpenAI and others that stayed closed to "protect us all".
> For complex tasks, Kimi K2.5 can self-direct an agent swarm with up to 100 sub-agents, executing parallel workflows across up to 1,500 tool calls.
> K2.5 Agent Swarm improves performance on complex tasks through parallel, specialized execution [..] leads to an 80% reduction in end-to-end runtime
Not just RL on tool calling, but RL on agent orchestration, neat!
One thing that caught my eye is that besides the K2.5 model, Moonshot AI also launched Kimi Code (https://www.kimi.com/code), which evolved from Kimi CLI. It's a terminal coding agent; I've been using it for the last month with a Kimi subscription, and it's a capable agent with a stable harness.
I've read several people say that Kimi K2 has a better "emotional intelligence" than other models. I'll be interested to see whether K2.5 continues or even improves on that.
Have you all noticed that the latest releases from Chinese companies (Qwen3 Max Thinking, now Kimi K2.5) are benchmarking against Claude Opus now, not Sonnet? They're truly catching up, and at an impressive pace.
K2 0905, and K2 Thinking shortly after it, have done impressively well in my personal use cases and were severely slept on. Faster, more accurate, less expensive, more flexible in terms of hosting, and available months before Gemini 3 Flash — I really struggle to understand why Flash got such positive attention at launch.
Interested in the dedicated Agent and Agent Swarm releases, especially in how that could affect third party hosting of the models.
Curious what would be the most minimal reasonable hardware one would need to deploy this locally?
Kimi was already one of the best writing models. Excited to try this one out
I've had weird situations where some models refused to use SSH as a tool. Not sure if that was a limitation of the coding tool or something baked into some of the models.
A realistic setup for this would be a 16× H100 80GB with NVLink. That comfortably handles the active 32B experts plus KV cache without extreme quantization. Cost-wise we are looking at roughly $500k–$700k upfront or $40–60/hr on-demand, which makes it clear this model is aimed at serious infra teams, not casual single-GPU deployments. I’m curious how API providers will price tokens on top of that hardware reality.
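A rough back-of-envelope check of that sizing (the FP8 quantization and the round 1T figure are my assumptions, not from the announcement):

```python
# Back-of-envelope VRAM estimate for a ~1T-parameter MoE model.
# Assumptions (mine): FP8 weights (1 byte/param), 16x H100 80GB cluster.
TOTAL_PARAMS = 1.0e12
BYTES_PER_PARAM = 1  # FP8

weights_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9  # ~1000 GB for weights

GPU_VRAM_GB = 80
num_gpus = 16
total_vram_gb = num_gpus * GPU_VRAM_GB  # 1280 GB across the cluster

# Leftover for KV cache, activations, and framework overhead.
headroom_gb = total_vram_gb - weights_gb

print(f"weights ~{weights_gb:.0f} GB, cluster VRAM {total_vram_gb} GB, "
      f"headroom ~{headroom_gb:.0f} GB")
```

So 16× 80GB leaves roughly a quarter of the cluster free for KV cache after FP8 weights, which matches the "no extreme quantization needed" claim.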
Congratulations, great work Kimi team.
Why is it that Claude is still at the top in coding? Are they heavily focused on training for coding, or is their general training just so good that it performs well in coding too?
Someone please beat Opus 4.5 in coding — I want to replace it.
As your resident vision nut, I can say their claims about "SOTA" vision are absolutely BS in my tests.
Sure it's SOTA at standard vision benchmarks. But on tasks that require proper image understanding, see for example BabyVision[0] it appears very much lacking compared to Gemini 3 Pro.
I don't get this "agent swarm" concept. You set up a task and they boot up 100 LLMs to try to do it in parallel, and then one "LLM judge" puts it all together? Is there anywhere I can read more about it?
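I don't know Moonshot's actual orchestration either, but the fan-out/reduce pattern people usually mean looks roughly like this (the `call_llm` function is a hypothetical stand-in for any model API, not Kimi's):

```python
import concurrent.futures

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model API call.
    return f"result for: {prompt}"

def agent_swarm(task: str, num_agents: int = 4) -> str:
    # Fan out: each sub-agent works a specialized slice of the task in parallel.
    subtasks = [f"{task} (sub-agent {i})" for i in range(num_agents)]
    with concurrent.futures.ThreadPoolExecutor() as pool:
        results = list(pool.map(call_llm, subtasks))
    # Reduce: a single "judge" call merges the partial results.
    merge_prompt = "Combine these partial results:\n" + "\n".join(results)
    return call_llm(merge_prompt)

print(agent_swarm("summarize the repo"))
```

The interesting part in K2.5's case is apparently that the orchestration policy itself is trained with RL, rather than being a hand-written loop like this sketch.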
Pretty cute pelican https://tools.simonwillison.net/svg-render#%3Csvg%20viewBox%...
Can we please stop calling these models "open source"? Yes, the weights are open, so "open weight", maybe. But the source isn't open — the thing that allows you to re-create it. That's what "open source" used to mean (together with a license that allows you to use that source for various things).
About 600GB needed for weights alone, so on AWS you need a p5.48xlarge (8× H100), which costs $55/hour.
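For scale, that on-demand rate adds up fast (simple arithmetic, assuming 24/7 usage):

```python
hourly = 55.0                    # p5.48xlarge on-demand $/hr, per the comment above
monthly = hourly * 24 * 30       # running around the clock for a 30-day month
print(f"${monthly:,.0f}/month")  # → $39,600/month
```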
Glad to see open-source models catching up and treating vision as a first-class citizen (a.k.a. a natively multimodal agentic model). GLM and Qwen take a different approach, with a base model plus a separate vision variant (glm-4.6 vs. glm-4.6v).
I guess after Kimi K2.5, other vendors will go down the same route?
Can't wait to see how this model performs on computer automation use cases like VITA AI Coworker.
Is this actually good, or just heavily optimized for benchmarks? I'm hopeful it's the former based on the writeup, but I need to put it through its paces.
There are so many models — is there a website with a list of all of them and a comparison of their performance on different tasks?
Those are some impressive benchmark results. I wonder how well it does in real life.
Maybe we can get away with something cheaper than Claude for coding.
they cooked
Actually open source, or yet another public model, which is the equivalent of a binary?
The URL is down, so I cannot tell.
Cool
The chefs at Moonshot have cooked once again.
Huggingface Link: https://huggingface.co/moonshotai/Kimi-K2.5
1T parameters, 32B active parameters.
License: MIT with the following modification:
Our only modification part is that, if the Software (or any derivative works thereof) is used for any of your commercial products or services that have more than 100 million monthly active users, or more than 20 million US dollars (or equivalent in other currencies) in monthly revenue, you shall prominently display "Kimi K2.5" on the user interface of such product or service.