I like the approach of running everything locally. I'm strongly of the opinion that the privacy angle for local models is going to keep getting stronger and more relevant. The more articles that come out about accidents caused by people handing too much context to cloud models, the more self-reinforcing this will become.
> I like the approach of running everything locally. I'm strongly of the opinion that the privacy angle for local models is going to keep getting stronger and more relevant.
In HN circles perhaps. Average Joes don’t care.
Another angle is when you're passing untrusted content to the AI service, e.g. anything from using it to crawl websites to spam-detection on new forum user posts.
You can trigger a ToS violation with the service or, worse, get reported to law enforcement for something you didn't even write.
local is best for privacy, but i personally think you don't need to go local.
anthropic, google, openai etc. decided that their consumer ai plans would not be private: partly to collect training data, partly so they can employ moderators to review user activity for safety.
we trust that human moderators will not review and flag our icloud docs, onedrive or gmail, or aggregate such documents into training data for llms. yet it became the norm that an llm is somehow not private. it became the norm that you can't opt out of training, even on paid plans (see meta and google); or if you can opt out of training, you can't opt out of moderation.
cloud models with a zero-retention privacy policy are private enough for almost everyone. the subscriptions, google search and ai search engines are either 'buying' your digital life or covering themselves for legal reasons.
you can and should have private cloud services, and if a legal agreement is not enough, cryptographic attestation is already used in cloud compute, e.g. AWS Nitro Enclaves and similar offerings from other providers.
The other thing: is encrypted inference a thing/service currently? I want to run my own models locally simply because, if I'm going to be chatting to one about my day-to-day life, why send that to a server in plaintext?
That's the way things have to go. The business risk of having everything run over exposed networks is too high.
It's only half of the solution though. If the models are trained in a closed way, they can prioritize values encoded during training even if that's not what you want (example: ask the open Chinese models about Tiananmen). It's not beyond imagining that these models would e.g. try to send your data to authorities or advertisers when their training says so, even if you run them locally.
So the full solution would be models trained in an open verifiable way and running locally.