Hacker News

anoncow today at 4:14 AM

3000 tokens per second on 32 MB of RAM?


Replies

fc417fc802 today at 4:56 AM

fast != practical

You can get lots of tokens per second on the CPU if the entire network fits in L1 cache. Unfortunately, the sub-64 KiB model segment isn't looking so hot.
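For a rough feel of why cache residency matters, here is a back-of-envelope sketch. It assumes decoding is purely bandwidth-bound, that every weight is read exactly once per generated token, and that the model fills the whole memory budget. The bandwidth figures are placeholders picked for illustration, not measurements of any real CPU, and `max_tokens_per_sec` is a hypothetical helper.

```python
KIB = 1024
MIB = 1024 * KIB


def max_tokens_per_sec(model_bytes: int, bandwidth_bytes_per_sec: float) -> float:
    """Upper bound on tokens/s if each token requires one full pass over the weights."""
    if model_bytes <= 0:
        raise ValueError("model_bytes must be positive")
    return bandwidth_bytes_per_sec / model_bytes


# Placeholder bandwidths (assumptions, not benchmarks).
ASSUMED_L1_BANDWIDTH = 1e12    # bytes/s for data served from L1 cache
ASSUMED_DRAM_BANDWIDTH = 2e10  # bytes/s for data served from main memory

# A model small enough to sit in a 64 KiB L1 cache.
l1_bound = max_tokens_per_sec(64 * KIB, ASSUMED_L1_BANDWIDTH)

# A model that fills the 32 MB mentioned upthread and streams from DRAM.
dram_bound = max_tokens_per_sec(32 * MIB, ASSUMED_DRAM_BANDWIDTH)

print(f"L1-resident bound:   {l1_bound:,.0f} tokens/s")
print(f"DRAM-resident bound: {dram_bound:,.0f} tokens/s")
```

These are ceilings only. Compute limits, activations, and KV-cache traffic all push real numbers lower, and none of it says anything about whether a model that small is useful.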

But actually ... 3000? Did GP misplace one or two zeros there?