Hacker News

npn · today at 6:55 AM · 1 reply

What gut? We are already doing that. There are a lot of "tiny" LLMs that are useful: M$ Phi-4, Gemma 3/3n, Qwen 7B... There are even smaller models, like Gemma 270M, which is fine-tuned for function calling.
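For context on what "fine-tuned for function calling" means in practice: the model emits a structured (usually JSON) call, and the application code parses and dispatches it. A minimal sketch of the dispatch side, assuming the model has already produced the JSON string (the tool names and arguments here are made up for illustration):

```python
import json

# Hypothetical tool registry -- a model like Gemma 270M would be prompted
# with these tool names and fine-tuned to emit a JSON call targeting one.
TOOLS = {
    "get_weather": lambda city: f"22C and sunny in {city}",
    "add": lambda a, b: a + b,
}

def dispatch(model_output: str):
    """Parse a model-emitted JSON function call and invoke the matching tool."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]        # KeyError if the model hallucinates a tool
    return fn(**call["arguments"])  # TypeError if the arguments don't match

# Example: the kind of string a tiny function-calling model might emit
print(dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}'))  # → 5
```

The whole point of using a tiny model here is that the hard part (validation, dispatch) lives in ordinary code; the model only has to pick a tool and fill in arguments.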

They haven't flourished yet for a simple reason: the frontier models are still improving. Right now it is better to use a frontier model than to train or fine-tune one of our own, because by the time we finish our model the world has already moved on.

Heck, even distillation is a waste of time and money, because by the time it's done, newer frontier models already yield better outputs.
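For reference, "distillation" here means training a small student model to match a frontier teacher's softened output distribution, typically with a KL-divergence loss. A minimal sketch in plain Python (the logits and temperature values are illustrative, not from any real model):

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature; a higher T softens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Identical logits give zero loss; diverging logits give a positive loss
print(distill_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))       # → 0.0
print(distill_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0]) > 0)   # → True
```

The argument above is that the loss target itself goes stale: the student is anchored to a teacher that the next frontier release will surpass.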

You can expect the landscape to change drastically in the next few years, once the proprietary frontier models stop making huge improvements with every version upgrade.


Replies

znnajdla · today at 7:10 AM

I’ve tried those tiny LLMs and they don’t seem useful to me for real-world tasks. They are toys, good only for super simple autocomplete.
