Hacker News

nl today at 2:59 AM (3 replies)

Taalas is interesting. 16,000 TPS for Llama on a chip.

https://taalas.com/


Replies

micw today at 5:35 AM

On a very old model; it's more like 16,000 garbage words/s.

replete today at 7:29 AM

It's exciting to see, but look at the die size for only an 8B model.

DeathArrow today at 6:07 AM

I wonder how many tokens per second they could get if they put Mercury 2 on a chip.