CubsFan1060 • today at 11:26 AM • 6 replies

Knowing very little about how to run these, how close are we to medium or larger businesses starting to buy hardware to run models like this to keep the models local?

It’s expensive, and not as capable as the frontier models, but would have some pretty big benefits around privacy and agency.

Replies

wongarsu • today at 11:54 AM

I know of multiple businesses in Europe that have been doing that for a while with 70B models, and are now upgrading hardware to run the new crop of 700B-1T models (this really started around Kimi K2, but buying and hosting that kind of hardware takes time).

Not everyone is willing (or even legally able) to send their trade secrets to OpenAI or Anthropic.

MikhailTal • today at 11:58 AM

This is not a new situation. The same thing happened when good vision models like AlexNet came along, especially for OCR. Companies had a choice between the cloud and self-hosting with GPUs. But it turns out the problem is usage patterns.

Your usage will peak during the working hours of certain time zones (even if you are a huge multinational company, most of your engineers/users tend to be in only a few locations), so you end up with a bunch of GPUs doing nothing the rest of the day, especially with latency-sensitive stuff. This is a decades-old tradeoff; it's not unique to LLMs.
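
To put rough numbers on the idle-GPU point, here is a minimal sketch. The hourly cost and busy hours are made-up inputs for illustration, not figures from anyone in this thread, and `cost_per_busy_hour` is a hypothetical helper name.

```python
def cost_per_busy_hour(hourly_cost: float, busy_hours_per_day: float) -> float:
    """Effective cost of one hour of actual use on hardware you pay for 24/7.

    hourly_cost: amortized cost of owning the GPU, per wall-clock hour (assumed).
    busy_hours_per_day: hours per day the GPU is actually serving requests (assumed).
    """
    if not 0 < busy_hours_per_day <= 24:
        raise ValueError("busy_hours_per_day must be in (0, 24]")
    # You pay for all 24 hours but only get value from the busy ones.
    return hourly_cost * 24 / busy_hours_per_day


if __name__ == "__main__":
    # Hypothetical: a GPU costing 2.00 per hour to own, busy only during an 8-hour workday.
    print(cost_per_busy_hour(2.0, 8))   # each useful hour costs 3x the sticker rate
    print(cost_per_busy_hour(2.0, 24))  # fully utilized: no idle penalty
```

The fewer hours the hardware is busy, the more each useful hour costs, which is the tradeoff a cloud provider smooths out by pooling customers across time zones.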

Havoc • today at 11:44 AM

It’s a ~750B model, so still a hell of a lot of VRAM

Would need to be a pretty determined medium biz
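
For a sense of scale, a back-of-the-envelope sketch of the memory needed for the weights alone. It assumes a dense 750B-parameter count, standard bytes-per-parameter for each precision, and hypothetical 80 GB cards; it ignores KV cache, activations, and runtime overhead, so real requirements are higher.

```python
import math

PARAMS = 750e9  # ~750B parameters, taken from the comment above

# Bytes per parameter at common precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


def gpus_needed(total_gb: float, gpu_gb: float = 80.0) -> int:
    """Minimum number of cards (assumed 80 GB each) just to hold the weights."""
    return math.ceil(total_gb / gpu_gb)


if __name__ == "__main__":
    for name, nbytes in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(PARAMS, nbytes)
        print(f"{name}: {gb:.0f} GB of weights, at least {gpus_needed(gb)} x 80 GB GPUs")
```

Under these assumptions, that is about 1,500 GB at fp16 and 375 GB even at 4-bit, before any memory for serving actual requests.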

moffkalast • today at 11:32 AM

So far there seems to be one major use case for complete privacy, and that is legal work. You don't need top-of-the-line models to search vast amounts of text in discovery, and it needs to be completely confidential. There are quite a few lawyers over on r/LocalLLaMA showing off their multi-GPU builds. Coincidentally, they also have the vast funding required for it.

petesergeant • today at 11:31 AM

Unless you have genuine national security concerns, you’d be better off just negotiating a commercial agreement with privacy protections with a couple of existing vendors.

re-thc • today at 11:28 AM

> how close are we to medium or larger businesses starting to buy hardware to run models like this to keep the models local?

Years.

Even Microsoft said they don't have enough for GitHub and need to call Amazon.

Getting even a few at decent prices is hard. Unless the shortage eases...
