With smaller models becoming more efficient and hardware continually improving, I think the sweet spot for local LLM computing will arrive in a couple of years.
So many comments like to highlight that you can buy a Mac Studio with 512GB of RAM for $10K, but that's a huge amount of money to spend on something that still can't compete with a $2/hour rented cloud GPU server in terms of output speed. And even that rented server will be lower quality and slower than the $20/month plan from the LLM provider of your choice.
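For a rough sense of scale, here's a back-of-envelope sketch using the prices quoted above; the full-time usage assumption (8 hours a day, 250 days a year) is mine:

    # Break-even between a $10K Mac Studio and a $2/hour cloud GPU server.
    # Ignores electricity, resale value, and any cloud storage/egress costs.
    mac_studio_cost = 10_000   # USD, 512GB Mac Studio (quoted above)
    cloud_rate = 2             # USD per hour, rented GPU server (quoted above)

    break_even_hours = mac_studio_cost / cloud_rate
    print(break_even_hours)              # 5000.0 hours of rented GPU time
    print(break_even_hours / (8 * 250))  # 2.5 years at 8h/day, 250 days/year

That's two and a half years of running it like a full-time job before the hardware pays for itself, and by then the hardware landscape will have moved on anyway.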
The only reasons to go local are if you need it (privacy, contractual obligations, regulations) or if you're a hardcore hobbyist who values running it yourself over quality and speed of output.
> Framework (the modular laptop company) has announced a desktop that can be configured up to 128GB unified RAM, but it's still going to come in at around 2-2.5k depending on your config.
Framework is getting a lot of headlines for their brand recognition, but there are a growing number of options with the same AMD Strix Halo part. Here's a random example I found from a Google search - https://www.gmktec.com/products/amd-ryzen%E2%84%A2-ai-max-39...
All of these are somewhat overpriced right now due to supply and demand. If the supply situation eases, they should come down in price.
They're great for what they are, but their memory bandwidth is still relatively limited (around 256 GB/s on Strix Halo, versus ~546 GB/s on an M4 Max). If the 128GB versions came down to $1K I might pick one up, but at the $2-3K price range I'd rather put that money toward upgrading my laptop to an M4 MacBook Pro with 128GB of RAM.
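To put numbers on the bandwidth point, here's a minimal sketch of the usual memory-bound decoding estimate, using those bandwidth figures; the 40GB model size is my assumption (roughly a 70B model at 4-bit quantization), not anything from this thread:

    # Token generation is memory-bound: each new token streams roughly the
    # whole model out of RAM, so tokens/sec ~= bandwidth / model size.
    def tokens_per_sec(bandwidth_gb_s: float, model_gb: float) -> float:
        return bandwidth_gb_s / model_gb

    model_gb = 40  # assumed: ~70B params at 4-bit quantization

    print(tokens_per_sec(256, model_gb))  # Strix Halo: ~6.4 tok/s
    print(tokens_per_sec(546, model_gb))  # M4 Max:    ~13.7 tok/s

It's only an upper bound (real throughput comes in lower), but it shows why, dollar for dollar at this price range, the Mac is the more attractive box for big local models.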