
Tepix, today at 7:42 AM

The pp (prompt processing) speeds are really slow (50). I think there's still room for improvement.


Replies

kamranjon, today at 8:06 AM

Ah yeah, after watching one of the creator's YouTube videos I realized these benchmarks combine prefill and decode, which isn't super helpful. It seems this struggles with the exact same bottleneck as all Strix Halo setups: memory bandwidth. It also still seems significantly slower than Mac hardware with an equivalent amount of memory.
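
To illustrate why a combined number isn't very helpful, here's a rough sketch. All the figures are made up for illustration and are not measurements from this hardware, and the function names are mine. The bandwidth estimate is just the usual back-of-envelope rule that each decoded token has to read the active weights once.

```python
def combined_tokens_per_second(prompt_tokens, generated_tokens,
                               prefill_tps, decode_tps):
    """Tokens/s reported when prefill and decode are timed as one run."""
    prefill_time = prompt_tokens / prefill_tps      # seconds spent on the prompt
    decode_time = generated_tokens / decode_tps     # seconds spent generating
    return (prompt_tokens + generated_tokens) / (prefill_time + decode_time)


def decode_upper_bound_tps(bandwidth_gb_s, active_weights_gb):
    """Rough ceiling on decode speed if every token reads all active weights."""
    return bandwidth_gb_s / active_weights_gb


# Hypothetical machine: 200 tok/s prefill, 20 tok/s decode.
# Same hardware, two different prompt/output mixes:
balanced = combined_tokens_per_second(1000, 1000, 200, 20)
prompt_heavy = combined_tokens_per_second(4000, 100, 200, 20)
print(f"balanced mix:     {balanced:.1f} tok/s")
print(f"prompt-heavy mix: {prompt_heavy:.1f} tok/s")

# Hypothetical 256 GB/s memory and 32 GB of weights read per token.
print(f"decode ceiling:   {decode_upper_bound_tps(256, 32):.1f} tok/s")
```

The combined figure swings a lot depending on the prompt/output ratio even though nothing about the machine changed, which is why separate prefill and decode numbers are easier to compare across setups.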
