True. I was thinking more of power users. Do you think Opus-level capabilities will run on your average laptop in a year? I think that's pretty far away, if it ever happens.
You can demonstrate "running" the latest open Kimi or GLM model on a top-of-the-line laptop today at very low throughput (Kimi at 2 tok/s, which is slow once you account for thinking time), courtesy of Flash-MoE with SSD weight offload. That's not Opus-like, it's not an "average" laptop, and it's not really usable outside niche purposes because of the low throughput. But it's impressive in a way, and it does give a nice idea of what might be feasible down the line.
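To put the 2 tok/s figure in perspective, here is a rough back-of-the-envelope sketch. Only the 2 tok/s rate comes from the numbers above; the token counts (2000 thinking tokens, 500 answer tokens) and the faster comparison rates are made-up assumptions for illustration, and prompt processing time is ignored entirely.

```python
def response_time_seconds(thinking_tokens: int, answer_tokens: int, tok_per_s: float) -> float:
    """Wall-clock time to generate all tokens at a fixed decode rate."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return (thinking_tokens + answer_tokens) / tok_per_s


# Hypothetical workload: these token counts are assumptions, not measurements.
THINKING_TOKENS = 2000
ANSWER_TOKENS = 500

# 2.0 tok/s is the rate quoted above; 20 and 50 are arbitrary comparison points.
for rate in (2.0, 20.0, 50.0):
    seconds = response_time_seconds(THINKING_TOKENS, ANSWER_TOKENS, rate)
    print(f"{rate:5.1f} tok/s -> {seconds / 60:5.1f} min per response")
```

Under those assumptions a single reply takes on the order of twenty minutes at 2 tok/s, which is why the thinking tokens matter so much: they are generated at the same slow rate before the visible answer even starts.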