
aetherspawn, yesterday at 11:06 PM

I can run most models of 100B parameters and under on my MacBook Pro with Ultra 3 and 128GB of RAM at roughly 25-70 tok/sec, and it cost around $5k.

I don’t think I can run GLM 5.2, since it requires around 256GB of memory and inference would probably be too slow, but the planned Apple M7 may be able to. The leaks say it will support up to ~700GB of RAM.
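For a rough sense of why 128GB covers most models at 100B and under but not something needing ~256GB, here is a back-of-the-envelope sketch. It only counts the weights (parameters times bits per weight) plus a flat overhead for KV cache and runtime. The 20% overhead figure and the function names are my own assumptions for illustration, not measured numbers, and real usage depends on context length and the inference engine.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits(params_billions: float, bits_per_weight: float, ram_gb: float,
         overhead_fraction: float = 0.2) -> bool:
    """True if weights plus an assumed flat overhead fit in ram_gb.

    overhead_fraction is a guess standing in for KV cache, activations,
    and runtime buffers; it is not a measured value.
    """
    needed = weight_memory_gb(params_billions, bits_per_weight) * (1 + overhead_fraction)
    return needed <= ram_gb


if __name__ == "__main__":
    # A 100B-parameter model at common quantization levels on a 128GB machine.
    for bits in (4, 8, 16):
        gb = weight_memory_gb(100, bits)
        print(f"100B @ {bits}-bit: ~{gb:.0f} GB weights, fits in 128GB: {fits(100, bits, 128)}")
```

Under these assumptions a 100B model fits at 4-bit or 8-bit but not at 16-bit, which lines up with "100B and under" being about the practical ceiling for 128GB.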

The models under 100B are dumb as a brick and aren’t that useful unless you’re really bad at coding, imo. They can’t be trusted not to hallucinate, so they’re not even good for data processing.