
EnPissant • today at 3:47 AM

Then again, I have an RTX 5090 + 96GB DDR5-6000 that crushes the Spark on prompt processing of gpt-oss-120b (something like 2-3x faster), while token generation is pretty close. I paid ~$3200 for the entire computer. With the currently inflated RAM prices, it would probably be closer to the Dell.

So while I think the Strix Halo is a mostly useless machine for any kind of AI, and I think the Spark is actually useful, I don't think pure inference is a good use case for them.

It probably only makes sense as a dev kit for larger cloud hardware.
