Hacker News

johnmaguire · today at 6:54 PM · 4 replies

I'd love to know how you fit smaller models into your workflow. I have an M4 MacBook Pro w/ 128GB RAM and while I have toyed with some models via ollama, I haven't really found a nice workflow for them yet.


Replies

philipkglass · today at 7:00 PM

It really depends on the tasks you have to perform. I am using specialized OCR models running locally to extract page layout information and text from scanned legal documents. The quality isn't perfect, but it is really good compared to the desktop/server OCR software I formerly used, which cost hundreds or thousands of dollars per license. If you have similar needs and the time to try just one model, start with GLM-OCR.
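A minimal sketch of that kind of local OCR call, assuming the Hugging Face transformers image-to-text pipeline with TrOCR as a stand-in model (not necessarily the model described above, which also handles layout), looks roughly like:

    # Sketch: local OCR over a scanned crop via the transformers image-to-text pipeline.
    # "microsoft/trocr-base-printed" is a stand-in; TrOCR expects a cropped text line,
    # while document-level models handle whole pages and layout.
    from transformers import pipeline
    from PIL import Image

    ocr = pipeline("image-to-text", model="microsoft/trocr-base-printed")

    line_crop = Image.open("scanned_line.png").convert("RGB")
    result = ocr(line_crop)

    # The pipeline returns a list of dicts with a "generated_text" key.
    print(result[0]["generated_text"])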

If you want a general knowledge model for answering questions or a coding agent, nothing you can run on your MacBook will come close to the frontier models. It's going to be frustrating if you try to use local models that way. But there are a lot of useful applications for local-sized models when it comes to interpreting and transforming unstructured data.
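For example, a minimal sketch of that kind of unstructured-data transformation, assuming the ollama Python client and a stand-in model name and prompt, could look like:

    # Sketch: use a small local model via Ollama to turn messy text into JSON.
    # The model name ("llama3.2") and the prompt are assumptions, not specifics
    # from this thread.
    import json
    import ollama

    raw = "Invoice #4821 from Acme Corp, due 2024-07-15, total $1,250.00"

    response = ollama.chat(
        model="llama3.2",
        format="json",  # ask Ollama to constrain the reply to valid JSON
        messages=[{
            "role": "user",
            "content": "Extract invoice_number, vendor, due_date, and total as JSON: " + raw,
        }],
    )

    # The reply text lives under message.content; parse it as JSON.
    print(json.loads(response["message"]["content"]))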

Bluecobra · today at 7:30 PM

I didn’t realize you could get 128GB of memory in a notebook; that is impressive!

echelon · today at 8:11 PM

Shouldn't we prioritize large-scale open weights and open-source cloud infra?

An OpenRunPod with decent usage might encourage more non-leading labs to dump foundation models into the commons. We just need the infra to run them. Distilling them down to desktop is a fool's errand. They're meant to run on DC compute.

I'm fine with running everything in the cloud as long as we own the software infra and the weights.

Conceivably, the only way we could catch up to Claude Code is for the Chinese labs to start releasing their best coding models and for those models to get significant traction with companies calling out to hosted versions. Otherwise, we're going to be stuck in a takeoff scenario with no bridge.

saltwounds · today at 7:11 PM

I use Raycast and connect it to LM Studio to run text cleanup and summaries often. The models are small enough that I keep them in memory more often than not.
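A rough sketch of the same kind of call without Raycast, assuming LM Studio's OpenAI-compatible local server on its default port (1234) and a placeholder model name:

    # Sketch: call a model loaded in LM Studio through its OpenAI-compatible local
    # server (default http://localhost:1234/v1). The API key can be any string;
    # the model name here is a placeholder, not what the setup above uses.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

    resp = client.chat.completions.create(
        model="local-model",  # LM Studio serves whichever model you have loaded
        messages=[
            {"role": "system", "content": "Summarize the user's text in two sentences."},
            {"role": "user", "content": open("notes.txt").read()},
        ],
    )

    print(resp.choices[0].message.content)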