antirez, yesterday at 7:11 PM

DS4 can process 460 prompt tokens per second on an M3 Max. Not stellar, but not so slow. See the benchmarks in the README.
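
To put that number in perspective, here is a rough back-of-the-envelope sketch of how long prompt processing would take at 460 tokens per second. The function name `prefill_seconds` is made up for illustration, and it assumes throughput stays constant as the prompt grows, which is a simplification.

```python
def prefill_seconds(prompt_tokens: int, tokens_per_second: float = 460.0) -> float:
    """Estimate prompt-processing time, assuming constant throughput (a simplification)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return prompt_tokens / tokens_per_second


# Hypothetical prompt sizes, chosen only for illustration.
for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} tokens -> {prefill_seconds(n):.1f} s")
```

So a 10,000-token prompt would take a bit over 20 seconds before the first output token, under that constant-rate assumption.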