Hacker News

perfmode, yesterday at 11:21 PM

How’s the token throughput / response time?


Replies

simonw, yesterday at 11:24 PM

Healthy!

  prefill: 30.91 t/s, generation: 29.58 t/s
From https://gist.github.com/simonw/31127f9025845c4c9b10c3e0d8612...