> This model? You can run it at Q4 with 70GB of VRAM.

> This beats the latest Sonnet while running locally
Not sure it will beat Sonnet at Q4.
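For scale, here's the back-of-envelope behind the 70GB figure. Assumption of mine: Q4_K-style quants average roughly 4.5 bits per weight once scales and block metadata are counted in; KV cache and runtime overhead are ignored.

```python
def q4_weight_gb(params_billions, bits_per_weight=4.5):
    """Rough weight footprint for a quantized model.
    4.5 bits/weight is an assumed average for Q4_K-style quants;
    KV cache and runtime buffers are not included."""
    return params_billions * bits_per_weight / 8

# a model around 120B parameters lands near the quoted 70GB
print(round(q4_weight_gb(120), 1))  # 67.5
```

So 70GB of VRAM fits roughly a 120B-parameter model's weights at Q4, with a little left over for the KV cache.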
> This is approaching consumer level territory (you can get a Mac Studio with 128GB of RAM for ~3500 USD).
For $3500 I can get 7-8 years of GLM using coding plans, a faster model, and much better code quality.
> For $3500 I can get 7-8 years of GLM
Mind sharing the go-to place to pay for open models?
> Not sure it will beat Sonnet at Q4.
Very valid. Importance-weighted quantization and TurboQuant on model weights can reduce loss a lot compared to "traditional" Q4, so one can be hopeful.
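To illustrate the importance-weighted idea (in the spirit of llama.cpp's imatrix quants, not TurboQuant itself): instead of picking the 4-bit scale from max-abs alone, you pick the scale that minimizes a weighted reconstruction error, where the weights come from calibration data. This is a toy sketch; the function name, scale search, and stand-in importance weights are my own.

```python
import numpy as np

def quantize_q4_weighted(x, importance):
    """Symmetric 4-bit quantization (integer grid [-8, 7]) whose scale
    minimizes importance-weighted squared error instead of plain max-abs.
    `importance` would come from calibration (e.g. mean squared
    activations per weight); here it is purely illustrative."""
    s0 = np.abs(x).max() / 7.0            # naive max-abs scale
    best_err, best_q, best_s = np.inf, None, None
    # scan candidate scales around the naive one (1.0 included as baseline)
    for f in np.concatenate(([1.0], np.linspace(0.5, 1.2, 50))):
        s = s0 * f
        q = np.clip(np.round(x / s), -8, 7)
        err = float(np.sum(importance * (x - q * s) ** 2))
        if err < best_err:
            best_err, best_q, best_s = err, q, s
    return best_q.astype(np.int8), best_s

rng = np.random.default_rng(0)
x = rng.normal(size=64)                   # one "group" of weights
imp = rng.random(64) + 0.1                # stand-in importance weights

q, s = quantize_q4_weighted(x, imp)
deq = q * s

# baseline: plain max-abs Q4 of the same group
s0 = np.abs(x).max() / 7.0
naive = np.clip(np.round(x / s0), -8, 7) * s0

werr = lambda y: float(np.sum(imp * (x - y) ** 2))
# the importance-aware scale never does worse on the weighted error
assert werr(deq) <= werr(naive) + 1e-12
```

Real schemes do this per block with far better search and grids, but the principle is the same: spend the quantization error where the calibration data says it matters least.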
> For $3500 I can get 7-8 years of GLM using coding plans, a faster model, and much better code quality
But you will own no computer, and that's also assuming prices stay what they are. Anyway, my point wasn't whether it makes financial sense for everyone. A lot of people are very happy not owning their movies, software, games, cars, or houses. I'm just happy there's a future where people can own and locally run the tech that was trained on their stolen data.