Hacker News

Fable 5 pushed Gemma 4 to 255 tok/s on WebGPU

24 points by kirubakaran, today at 2:14 PM | 6 comments

Comments

mike_hearn, today at 3:52 PM

That's very impressive. What's the best way to run these kernels natively on a Mac? I saw that there's a way to plug Claude into Apple's Foundation Models framework, and there's a CLI tool that can access models via that framework. It might be useful to have something so fast and good available via a small CLI tool for various purposes, especially when connected with a small suite of tools I have for things like file editing, showing, simple agentic purposes etc.

nmfisher, today at 3:51 PM

It's not immediately clear, but this seems to be 250 tok/s on an M4 Max.

For comparison, the current agent swarm challenge on HF is at 508 tok/s on an A10G GPU:

https://huggingface.co/spaces/gemma-challenge/gemma-dashboar...

freedomben, today at 3:20 PM

More of a meta comment, but I really wish Anthropic would say something about their plans for Fable. We're all just kind of left here floating and aimless, with no idea of what to expect.
