Super interesting data.
I do question this finding:
> the small model category as a whole is seeing its share of usage decline.
It's important to remember that this data is from OpenRouter... an API service. Small models are exactly the ones that can be self-hosted.
It could be the case that total small model usage has actually grown, but people are self-hosting rather than using an API. OpenRouter would not be in a position to determine this.
I like seeing stats like these, but I find it very concerning that OpenRouter has no shame about inspecting its user/customer data.
Even if you pretend the classifier respects anonymity, when I pay for inference I expect a closed pipe where my privacy is respected. If it were at least for "safety" checks, I wouldn't like it, but I could almost understand; here it's just so they can have "marketing data".
Imagine (and given the state of the world, it might come soon) that WhatsApp or Telegram inspected all the messages you send and published reports like:
- 20% of our users speak about their health issues
- 30% of messages are about annoying coworkers
- 15% are messages comparing dick sizes
Very interesting how Singapore ranks 2nd in terms of token volume. I wonder if this is potentially Chinese usage via VPN, or if Singaporean consumers and firms are dominating in AI adoption.
Also interesting how dominant the 'roleplaying' category is. It makes me wonder whether Google's classifier sees a system prompt with "Act as an X" and classifies it as roleplay rather than as the specific industry the roleplay was meant to serve.
> The noticeable spike [~20 percentage points] in May in the figure above [tool invocations] was largely attributable to one sizable account whose activity briefly lifted overall volumes.
The fact that one account can have such a noticeable effect on token usage is kind of insane. It also raises the question of how much token usage comes from just one, five, or ten sizable accounts.
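To put a rough number on how big such an account must be, here's a back-of-the-envelope sketch; the baseline tool-call share is a made-up assumption, only the ~20 pp jump comes from the report:

```python
# Back-of-the-envelope: how much traffic must one account add to lift the
# tool-invocation share by ~20 percentage points?
# The baseline share and normalization are made-up assumptions; only the
# ~20 pp jump comes from the report.

baseline_share = 0.20   # assumed pre-spike share of tokens involving tool calls
delta = 0.20            # the ~20 percentage point jump
total_tokens = 1.0      # normalize to the pre-spike weekly total

# If the account adds `extra` tokens that are all tool-calling traffic:
#   (baseline_share * total_tokens + extra) / (total_tokens + extra) == baseline_share + delta
# Solving for `extra`:
extra = delta * total_tokens / (1 - baseline_share - delta)

print(f"extra traffic: {extra:.2f}x the prior weekly total")
print(f"share of that week's tokens: {extra / (total_tokens + extra):.0%}")
# => ~0.33x the prior volume, i.e. roughly a quarter of the spike week's tokens
```

Under that (arbitrary) baseline, a single account would have to supply about a quarter of that week's tokens, which makes the concentration question above very real.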
The 'Glass slipper' idea makes sense to me; people have a bunch of different ideas to try on AIs, they try them as new models come out, and once a model does one well they stick with it for a while.
According to the report, 52% of all open-source AI usage is *roleplaying*. They attribute it to fewer content filters and higher creativity.
I'm pretty surprised by that, but I guess it also selects for the kind of people who would use OpenRouter.
I worry that OpenRouter's Apps leaderboard incentivizes tools (e.g. Cline/Kilo) to burn through tokens to climb the ranks, while penalizing context efficiency.
Here is the thing: they made good enough open weight models available and affordable, then found that people used them more than before. I am not trying to diminish the value here but I don’t think this is the headline.
my highlights of this report: https://news.smol.ai/issues/25-12-04-openrouter
Overall really interesting read, but I'm having trouble processing this:
> OpenRouter performs internal categorization on a random sample comprising approximately 0.25% of all prompts
How can you arrive at any conclusion with such a small random sample size?
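Whether that's enough really depends on the absolute number of prompts behind the 0.25%, which the report doesn't state. A quick sketch, with a completely made-up weekly prompt volume, shows the kind of precision such a sampling rate could give:

```python
# What a 0.25% sample buys you, assuming simple random sampling.
# The weekly prompt count is a made-up assumption purely for illustration.
import math

weekly_prompts = 10_000_000                 # assumption, not a figure from the report
sample_size = int(weekly_prompts * 0.0025)  # 0.25% sample -> 25,000 prompts

# 95% margin of error for an estimated category share, worst case p = 0.5:
p = 0.5
moe = 1.96 * math.sqrt(p * (1 - p) / sample_size)
print(f"sample size: {sample_size:,}, margin of error: ±{moe:.2%}")
# => sample size: 25,000, margin of error: ±0.62%
```

The point is that the margin of error depends on the absolute sample size, not on the 0.25% rate, so the answer hinges on how many prompts they actually handle.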
The open weight model data is very interesting. I missed the release of Minimax M2. The benchmarks seem insanely impressive for its size. I would suspect benchmaxing but why would people be using it if it wasn’t useful?
> The metric reflects the proportion of all tokens served by reasoning models, not the share of "reasoning tokens" within model outputs.
I'd be interested in a clarification on the reasoning vs non-reasoning metric.
Does this mean the reasoning total is (input + reasoning + output) tokens, or just (input + output)?
Obviously the reasoning tokens would add a ton to the overall count, so it would be interesting to see an apples-to-apples comparison with non-reasoning models.
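To make the distinction concrete, here's a toy illustration; the field names are hypothetical, not OpenRouter's actual usage schema:

```python
# Two ways to count the tokens of a single reasoning-model request.
# Field names are hypothetical, not OpenRouter's actual API schema.
from dataclasses import dataclass

@dataclass
class RequestUsage:
    input_tokens: int
    reasoning_tokens: int  # hidden chain-of-thought; 0 for non-reasoning models
    output_tokens: int

req = RequestUsage(input_tokens=1_000, reasoning_tokens=4_000, output_tokens=500)

# Reading 1: count everything the reasoning model served
total_with_reasoning = req.input_tokens + req.reasoning_tokens + req.output_tokens

# Reading 2: count only what a non-reasoning model would also have produced
total_without_reasoning = req.input_tokens + req.output_tokens

print(total_with_reasoning, total_without_reasoning)  # 5500 1500
```

The same request counts as 5,500 tokens under one reading and 1,500 under the other, which is exactly why the apples-to-apples question matters.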
This is interesting, but I found it moderately disturbing that they spend a LOT of effort up front talking about how they don't have any access to the prompts or responses, and then reveal that they actually do have access to the text and spend 80% of the rest of the paper analyzing its content.
*State of non-enterprise, indie AI
All this data confirms that OpenRouter’s enterprise ambitions will fail. It’s a nice product for running Chinese models tho
This is really amazing data. Super interesting read
I am someone who wants to keep a distance from the AI-hype train, but seeing a chart like this [1], I can't help thinking that we are nowhere near the peak. Weekly token consumption keeps rising, it's already in the trillions, and this ignores a lot of consumption happening directly through APIs.
Nvidia could keep delivering record-breaking numbers, and we may well see multiple companies hit six, seven, or even eight trillion dollars in market cap within a couple of years. While I am skeptical of claims that AI will make programming obsolete, it's clear that adoption is still growing like crazy, and it's hard to anticipate when the plateau will happen.
[1]: https://openrouter.ai/state-of-ai#open-vs_-closed-source-mod...