dsv4 flash is 158B params in total. It's possible to run it locally, but it would take all of my system RAM.
Also, a lot of my day-to-day tasks perform the same on both small and bigger models: summarize a web page, draft a response, translations, quick web search, etc.
dsv4 flash has 284 billion parameters, not 158 billion.
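For a rough sense of what that parameter count means for RAM, here is a back-of-the-envelope sketch. It assumes the weights dominate memory and ignores the KV cache, activations, and runtime overhead. The bit widths are illustrative assumptions, not the formats the model is actually published in, and `weight_memory_gib` is just a hypothetical helper name.

```python
def weight_memory_gib(params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    bytes_total = params * bits_per_weight / 8
    return bytes_total / 2**30


PARAMS = 284e9  # parameter count quoted in this thread

# Assumed bit widths, for illustration only.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gib(PARAMS, bits):.0f} GiB")
```

Even at an assumed 4 bits per weight, the weights alone would not fit on a typical laptop, which is consistent with the "all my system RAM" remark above.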
Hugging Face's little parameter count badge seems unreliable.
Sorry, I meant non-locally.
I'm assuming privacy is not a concern since you mentioned using DeepSeek already. The cost of V4 Flash for small tasks is so minuscule as to be almost free, and you don't have to deal with a churning laptop (or with buying a high-end laptop, for someone who doesn't already have one).
I guess what I'm really asking is, what's the advantage of using these small local models if privacy isn't a concern?