Given the cost of the system, how long would it take to be less expensive than, for example, a $200/mo Claude Max subscription with Opus running?
It's not really an apples-to-apples comparison - I enjoy playing around with LLMs, running different models, etc, and I place a relatively high premium on privacy. The computer itself was $2k about two years ago (and my employer reimbursed me for it), and 99% of my usage is for research questions which have relatively high output per input token. Using one for a coding assistant seems like it can run through a very high number of tokens with relatively few of them actually being used for anything. If I wanted a real-time coding assistant, I would probably be using something that fit in the 24GB of VRAM and would have very different cost/performance tradeoffs.
Never, local models are for hobby and (extreme) privacy concerns.
A less paranoid and much more economically efficient approach would be to just lease a server and run the models on that.