I can (barely, but sustainably) run Q3.5 397B on my Mac Studio with 256GB unified. It cost $10,000 but that's well within reach for most people who are here, I expect.
$10k is well outside my budget for frivolous computer purchases.
I'm running it on my Intel Xeon W5 with 256GB of DDR5 and Nvidia 72GB VRAM. Paid $7-8k for this system. Probably cost twice as much now.
Using UD-IQ4_NL quants.
Getting 13 t/s. Using it with thinking disabled.
For some reason you were being downvoted but I enjoy hearing how people are running open weights models at home (NOT in the cloud), and what kind of hardware they need, even if it's out of my price range.
you have proved my point
Hacker News moment