I just don’t believe that this can run inference on a 120 billion parameter model at actually useful speeds.
Obviously any Turing machine can run a model of any size, so the "120B" claim doesn't mean much on its own. What actually matters is speed, and I just don't believe this can be fast enough on models that my $5000 5090-based PC is too slow for and lacks the VRAM to hold.
Look at the GPU and RAM specs; 120B seems workable.
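A rough back-of-envelope supports estimating this yourself: single-stream decoding is typically memory-bandwidth-bound, since every generated token has to stream the model's weights from memory once. The numbers below (4-bit quantization, 273 GB/s bandwidth) are illustrative assumptions, not the actual specs of this machine:

```python
# Back-of-envelope decode speed for a memory-bandwidth-bound LLM.
# All hardware numbers here are hypothetical -- plug in real specs.

def decode_tokens_per_sec(params_b: float, bits_per_weight: int, bandwidth_gb_s: float) -> float:
    """Each generated token streams every weight once, so decode speed
    is roughly memory bandwidth divided by the weights' footprint."""
    weight_bytes = params_b * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / weight_bytes

# A dense 120B model at 4 bits/weight is ~60 GB of weights.
speed = decode_tokens_per_sec(params_b=120, bits_per_weight=4, bandwidth_gb_s=273)
print(f"{speed:.1f} tok/s")  # ~4.6 tok/s at the assumed 273 GB/s
```

Note that if the 120B model is a mixture-of-experts, only the active parameters stream per token, so real decode speeds can be several times higher than this dense-model estimate.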