Hacker News

greenknight today at 4:02 AM

The thing is, it doesn't need to beat 4.7. It just needs to do somewhat well against it.

This is free... as in you can download it, run it on your own systems, and fine-tune it to be the way you want it to be.


Replies

libraryofbabel today at 5:51 AM

> you can download it, run it on your systems

In theory, sure, but as others have pointed out, you need to spend half a million on GPUs just to get enough VRAM to fit a single instance of the model. And you'd better make sure your use case makes full 24/7 use of all that rapidly-depreciating hardware you just spent your money on, otherwise your actual cost per token will be much higher than you think.

In practice you will get better value from just buying tokens from a third party whose business is hosting open weight models as efficiently as possible and who make full use of their hardware. Even with the small margin they charge on top you will still come out ahead.
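The break-even point above can be sketched with some rough arithmetic. All numbers here (hardware cost, throughput, utilization, hosted price) are illustrative assumptions, not figures from the thread or any vendor:

```python
# Back-of-envelope: self-hosting vs. buying tokens from a hosted API.
# Every number below is an assumption for illustration only.

hardware_cost = 500_000          # USD for enough GPUs to fit the model (assumed)
depreciation_years = 3           # write the hardware off over 3 years (assumed)
power_and_ops_per_year = 50_000  # USD: electricity, cooling, maintenance (assumed)
tokens_per_second = 500          # assumed aggregate throughput of the rig
utilization = 0.25               # fraction of 24/7 the rig is actually busy

seconds_per_year = 365 * 24 * 3600
tokens_per_year = tokens_per_second * utilization * seconds_per_year
cost_per_year = hardware_cost / depreciation_years + power_and_ops_per_year

self_hosted_per_million = cost_per_year / tokens_per_year * 1_000_000
print(f"self-hosted: ${self_hosted_per_million:.2f} per million tokens")

# Compare to a hypothetical hosted price of $2 per million tokens:
hosted_per_million = 2.00
print("buy tokens" if hosted_per_million < self_hosted_per_million else "self-host")
```

At 25% utilization the self-hosted figure comes out well above the hypothetical hosted price; pushing utilization toward 100% is what closes the gap, which is the commenter's point about needing 24/7 use.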

p1esk today at 4:09 AM

Do you think a lot of people have “systems” to run a 1.6T model?

onchainintel today at 4:23 AM

Completely agree, not suggesting it needs to, just genuinely curious. Love that it can be run locally, though. Open-source LLMs have been punching back pretty hard against proprietary cloud models lately in terms of performance.

kelseyfrog today at 4:14 AM

What's the hardware cost to run it?

johnmaguire today at 4:09 AM

... if you have 800 GB of VRAM free.
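The 800 GB figure is consistent with 1.6T parameters at 4-bit quantization (0.5 bytes per parameter). A quick sketch of the weight-memory math, which ignores KV cache and activations on top:

```python
# Rough VRAM needed just to hold the weights of a 1.6T-parameter model.
# Assumes dense storage at the given precision; KV cache and activations
# add more memory on top of this.
params = 1.6e12  # 1.6 trillion parameters

def weight_gb(bytes_per_param):
    """Weight memory in GB (decimal) at a given precision."""
    return params * bytes_per_param / 1e9

print(f"fp16:  {weight_gb(2):,.0f} GB")    # 2 bytes/param -> 3,200 GB
print(f"int8:  {weight_gb(1):,.0f} GB")    # 1 byte/param  -> 1,600 GB
print(f"4-bit: {weight_gb(0.5):,.0f} GB")  # 0.5 bytes/param -> 800 GB
```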
