
Eonexus, today at 7:18 AM

I wonder what the tokens per second actually are. Yes, it does say "reading speed" but that varies for everyone, no?


Replies

cafkafk, today at 7:37 AM

That is a very fair point! I just ran a not very scientific benchmark with the system under load and posted the raw logs in a sibling comment above. The short answer is that it's hitting 11.94 tokens per second for generation, while also serving as a binary cache and CI build server.

Purely vibes-based, I think it goes up to 20+ tps when it's not under load (and that's me trying to be conservative). For context, reading speed at 250 wpm would be around 5 to 6 tokens per second.
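
For anyone who wants to check the reading-speed comparison, here is a rough sketch of the conversion. The 1.33 tokens-per-word ratio is my assumption (a common rule of thumb for English text, not something measured on this setup), and the function names are just illustrative. The real ratio depends on the tokenizer and the text.

```python
# Assumption: ~1.33 tokens per English word. This is a rule of thumb,
# not a measured value; it varies by tokenizer and by text.
TOKENS_PER_WORD = 1.33


def wpm_to_tps(wpm, tokens_per_word=TOKENS_PER_WORD):
    """Convert a reading speed in words per minute to tokens per second."""
    return wpm / 60.0 * tokens_per_word


def tps_to_wpm(tps, tokens_per_word=TOKENS_PER_WORD):
    """Convert a generation speed in tokens per second to words per minute."""
    return tps * 60.0 / tokens_per_word


if __name__ == "__main__":
    # 250 wpm is the reading speed quoted above.
    print(f"250 wpm ~ {wpm_to_tps(250):.1f} tokens/s")
    # 11.94 tokens/s is the under-load figure quoted above.
    print(f"11.94 tokens/s ~ {tps_to_wpm(11.94):.0f} wpm")
```

Under that assumed ratio, 250 wpm lands inside the 5 to 6 tokens per second range, and the under-load generation speed works out to roughly twice that reading speed.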
