logoalt Hacker News

tyretoday at 12:12 AM2 repliesview on HN

Given the cost of the system, how long would it take to be less expensive than, for example, a $200/mo Claude Max subscription with Opus running?


Replies

Workaccount2today at 2:58 AM

Never, local models are for hobby and (extreme) privacy concerns.

A less paranoid and much more economically efficient approach would be to just lease a server and run the models on that.

show 1 reply
mechagodzillatoday at 12:23 AM

It's not really an apples-to-apples comparison - I enjoy playing around with LLMs, running different models, etc, and I place a relatively high premium on privacy. The computer itself was $2k about two years ago (and my employer reimbursed me for it), and 99% of my usage is for research questions which have relatively high output per input token. Using one for a coding assistant seems like it can run through a very high number of tokens with relatively few of them actually being used for anything. If I wanted a real-time coding assistant, I would probably be using something that fit in the 24GB of VRAM and would have very different cost/performance tradeoffs.