mmoustafa • yesterday at 9:43 PM • 1 reply

I would love to see real-life tokens/sec values advertised for one or various specific open source models.

I'm currently shopping for offline hardware, and it is very hard to estimate the performance I will get before dropping $12K. I would love to have a baseline that I can always count on, e.g. at least 40 tok/s running GPT-OSS-120B using Ollama on Ubuntu out of the box.

Replies

hpcjoe • yesterday at 10:35 PM

Look for llmfit on GitHub. It will help with that analysis, and I've found it reasonably accurate. If you already have Ollama installed, it can download the relevant models directly.
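Once you have hardware in hand (or can borrow time on some), a measured number is easy to get from Ollama itself: the non-streaming response from its generate endpoint reports `eval_count` (tokens generated) and `eval_duration` (in nanoseconds). Below is a minimal sketch, assuming a default local Ollama install listening on port 11434. The helper names `tokens_per_second` and `measure` are made up for illustration, and this is not how llmfit works internally.

```python
import json
import urllib.request

# Default local Ollama endpoint; change it if your install listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(response: dict) -> float:
    """Generation speed from an Ollama generate response.

    eval_count is the number of generated tokens and eval_duration is the
    time spent generating them, in nanoseconds.
    """
    return response["eval_count"] / response["eval_duration"] * 1e9


def measure(model: str, prompt: str) -> float:
    """Run one non-streaming generation and return generated tokens per second."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return tokens_per_second(json.load(reply))
```

Calling `measure("gpt-oss:120b", "Explain TCP slow start.")` would return the decode speed for one run; the model tag here is an assumption, so use whatever `ollama list` shows on your machine. A single run is noisy and the figure excludes prompt processing time, so average a few runs with prompts of the length you actually expect to use.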
