Given that the current best open-weight model (Kimi 2.6) is 1T A32B, I wonder how long we’ll have to wait for hardware like Strix Halo or DGX Spark to be able to run it.
The bigger the [dense] model, the longer inference tends to take; the cost scales roughly linearly with parameter count.
In that sense, how long you'd need to wait to get, say, ~20 tk/s... maybe never.
(Save a significant firmware update / translation layer.)
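To make the "roughly linear" point concrete, here's a back-of-envelope decode-speed estimate for a memory-bandwidth-bound model. All the specific numbers (bandwidth, quantization, active parameter count) are illustrative assumptions, not measured specs:

```python
# Back-of-envelope decode speed for a memory-bandwidth-bound LLM.
# Every number below is an illustrative assumption, not a measured spec.

def tokens_per_second(active_params_b: float, bytes_per_param: float,
                      mem_bandwidth_gbs: float) -> float:
    """Each decoded token must stream every active parameter from memory,
    so throughput is roughly bandwidth / (active params * bytes each)."""
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return mem_bandwidth_gbs * 1e9 / bytes_per_token

# ~32B active params (the "A32B" in a 1T-total MoE), 4-bit quant
# (0.5 bytes/param), ~256 GB/s unified-memory bandwidth
# (roughly Strix Halo class).
print(tokens_per_second(32, 0.5, 256))  # → 16.0
```

Double the active parameters (or the bytes per parameter) and tokens/s halves, which is why ~20 tk/s may stay out of reach on this class of hardware without a big bandwidth jump.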