What can it actually run? The fact that their benchmark plot only shows Llama 3.1 8B suggests to me that the stack is hand-implemented for that model and likely can't run newer or larger ones. Why else would you benchmark such an outdated model? Show me a benchmark for gpt-oss-120b or something comparable.
Looking at their blog, they in fact ran gpt-oss-120b: https://furiosa.ai/blog/serving-gpt-oss-120b-at-5-8-ms-tpot-...
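(For rough intuition, and assuming that headline figure applies per request stream, 5.8 ms per output token works out to about 170 tokens/s of decode throughput per stream; batch size and context length aren't stated here, so treat it as a back-of-envelope number.)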
I think the Llama 3 focus mostly reflects demand. It may be hard to believe, but many people aren't even aware that gpt-oss exists.
That so many people focus solely on massive LLMs is itself an oversight: they're narrowing in on a tiny (but very lucrative) subdomain of AI applications.