logoalt Hacker News

captainblandlast Thursday at 2:36 PM2 repliesview on HN

I think the issue is that LLMs are a cash problem as much as they are a technical problem. Consumer hardware architectures are still pretty unfriendly to running models which are actually competitive to useful models so if you want to even do inference on a model that's going to reliably give you decent results you're basically in enterprise territory. Unless you want to do it really slowly.

The issue that I see is that Nvidia etc. are incentivised to perpetuate that so the open source community gets the table scraps of distills, fine-tunes etc.


Replies

butlikelast Thursday at 3:02 PM

You got me thinking that what's going to happen is some GPU maker is going to offer a subsidized GPU (or RAM stick, or ...whatever) if the GPU can do calculations while your computer is idle, not unlike Folding@home. This way, the company can use the distributed fleet of customer computers to do large computations, while the customer gets a reasonably priced GPU again.

show 1 reply
cyanydeezlast Thursday at 8:37 PM

Consumer hardware is there. grab a mac or AMD395+ and Qwen coder and Cline or Open code and you're getting 80% of the real efficiency.

show 1 reply