Hacker News

nl today at 1:56 AM | 2 replies

If you are going to go to the bother of fine-tuning for trivial problems like subject classification, then I think you'll find that scikit-learn with an SGDClassifier on 2-grams will probably do just as well, and the trained classifier will be under 1MB.

You can train it in under a minute, and it will work perfectly well on embedded devices.
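
For anyone who wants to try it, here is a minimal sketch of that kind of setup. It assumes "2-grams" means word unigrams plus bigrams with TF-IDF weighting, which is a guess on my part, and the toy texts and labels are made up purely for illustration. No claims here about accuracy or model size on a real dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; a real run would use the labelled training set.
texts = [
    "the team won the match in extra time",
    "the striker scored two goals last night",
    "the coach praised the defence after the game",
    "the new phone ships with a faster chip",
    "the laptop update fixes a battery bug",
    "the compiler release adds faster builds",
]
labels = ["sports", "sports", "sports", "tech", "tech", "tech"]


def build_classifier():
    # Assumption: word unigrams + bigrams, TF-IDF weighted, feeding a
    # linear model trained with stochastic gradient descent.
    return make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
        SGDClassifier(loss="hinge", max_iter=1000, tol=1e-3, random_state=0),
    )


clf = build_classifier()
clf.fit(texts, labels)

# Classify an unseen sentence; the result is one of the training labels.
prediction = clf.predict(["the goalkeeper saved a late penalty"])
print(prediction[0])
```

Swapping TfidfVectorizer for HashingVectorizer would cap the feature count, which is one way to keep the saved model small, but that is a design choice to measure rather than assume.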

Small LLMs are good choices for text classification in two cases:

- If you need to provide in-context examples and classify based on them.

- Your classification goes beyond simple subject-type classifiers. For example, multiple-choice question answering is a classification task where a small LLM will work but traditional ML methods won't.


Replies

djsjajah today at 2:46 AM

Not with 800 examples. If you are going to consider an n-gram model, I think you are better off getting a frontier LLM to write you an absurd regex.

brokensegue today at 4:30 AM

there are models between 2-grams and 600m param models that would be good options. i don't expect a 2-gram to do very well here. also i'm not sure why this model isn't a fine choice if it solves their problem
