Hacker News
perfmode • yesterday at 11:21 PM • 1 reply
How’s the token throughput / response time?
Replies
simonw • yesterday at 11:24 PM
Healthy!
prefill: 30.91 t/s, generation: 29.58 t/s
From https://gist.github.com/simonw/31127f9025845c4c9b10c3e0d8612...
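Since the question also asked about response time, here is a rough sketch of how the two rates quoted above might translate into wall-clock time. It assumes the simplest possible model (prefill time plus generation time, nothing else), and the token counts in the usage line are made up for illustration; the function name `estimate_response_seconds` is hypothetical and not from the gist.

```python
def estimate_response_seconds(prompt_tokens, output_tokens,
                              prefill_tps=30.91, generation_tps=29.58):
    """Rough latency estimate from the two throughput figures in the reply.

    Assumption: total time = prompt_tokens / prefill rate
                           + output_tokens / generation rate.
    Model load time, sampling overhead, and caching are ignored.
    """
    prefill_seconds = prompt_tokens / prefill_tps
    generation_seconds = output_tokens / generation_tps
    return prefill_seconds + generation_seconds


if __name__ == "__main__":
    # Hypothetical workload: 1000-token prompt, 500-token answer.
    print(f"{estimate_response_seconds(1000, 500):.1f} s")
```

With a prefill rate this close to the generation rate, long prompts would likely dominate the wait under this model, so the estimate is most sensitive to prompt length.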