I think I still prefer local, but I feel like that's because most AI inference is kinda slow, or at best comparable to local. But I recently tried out Cerebras (I have heard about Groq too), and honestly, when you try things at 1000 tk/s or similar, your mental model really shifts and you become quite impatient. Cerebras does say that they don't log your data or anything in general, and you'll have to trust me when I say that I am not sponsored by them (wish I was tho). It's just that they are kinda nice.
But I still hope that we can someday actually have some meaningful improvements in speed too. Diffusion models seem to be really fast as an architecture.
> Cerebras does say that they don't log your data or anything in general
Until a judge says they must log everything, indefinitely.