Hacker News

QuadmasterXLII today at 12:56 PM

The headline says a hundred billion parameters, yet none of the official models are over 10 billion parameters. Curious.


Replies

Tuna-Fish today at 1:14 PM

The project is an inference framework that should support a 100B-parameter model at 5-7 tok/s on CPU. No one has quantized a 100B-parameter model to 1 trit per weight yet, but the existence of this framework is an incentive for someone to do so.
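
For a rough sense of scale, here is a back-of-the-envelope sketch of what 100B ternary weights would occupy and what 5-7 tok/s would imply for memory traffic. The 2-bits-per-weight packing and the "every weight is read once per generated token" model are my assumptions for illustration, not figures from the project, and `weight_bytes` and `required_bandwidth_gbs` are hypothetical helper names.

```python
import math

N_PARAMS = 100e9  # the 100B figure quoted in the comment above


def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Bytes needed to store n_params weights at the given bits per weight."""
    return n_params * bits_per_weight / 8


# Information-theoretic floor for one trit is log2(3), about 1.585 bits.
ideal_gb = weight_bytes(N_PARAMS, math.log2(3)) / 1e9

# ASSUMPTION: a simple packing that spends 2 bits per ternary weight.
packed_gb = weight_bytes(N_PARAMS, 2.0) / 1e9


def required_bandwidth_gbs(weights_gb: float, tokens_per_s: float) -> float:
    """ASSUMPTION: dense model, all weights streamed from RAM once per token."""
    return weights_gb * tokens_per_s


print(f"ideal packing:  {ideal_gb:.1f} GB")
print(f"2-bit packing:  {packed_gb:.1f} GB")
for tps in (5, 7):
    bw = required_bandwidth_gbs(packed_gb, tps)
    print(f"{tps} tok/s at 2-bit packing needs about {bw:.0f} GB/s")
```

Under those assumptions the weights alone come to a few tens of GB, so whether 5-7 tok/s is reachable depends heavily on the machine's memory bandwidth and on how the weights are actually packed.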