I would love to see real-life tokens/sec values advertised for one or various specific open source models.
I'm currently shopping for offline hardware and it is very hard to estimate the performance I will get before dropping $12K, and would love to have a baseline that I can at least always get e.g. 40 tok/s running GPT-OSS-120B using Ollama on Ubuntu out of the box.
Look for llmfit on github. This will help with that analysis. I've found it reasonably accurate. If you have Ollama already installed, it can download the relevant models directly.