Hacker News

properbrew today at 5:31 PM

I think small models have a very good niche for specific tasks. I utilise a fine-tuned Phi-4 model (smaller than this one) that fits in about 3.5 GB of RAM (not VRAM) for the document processing side of the desktop app I develop (a bit of a shameless plug - whistle-enterprise.com).

If you have a very specific idea for local model use, you can find a way to make it work very well; you don't even need a graphics card or NPU chip. You just have to be extremely constrained in how it's used. As a generic chatbot, though, I don't think they're great. For that I'd use a hosted SOTA model, and I say that as a big fan of local LLMs myself.


Replies

SeriousM today at 6:35 PM

Thank you for sharing your use case! I like your product very much!

Could you talk a bit about how you did the fine-tuning? Did you use Unsloth or another tool, and how did you verify the outcome?