Models will always struggle with this specific task without tool use, because of the way they tokenize text. I think a bit of prompt engineering solves this: ask the model to spell out each word, or give it the ability to run a "contains e" Python function over the animal names it generates or searches for.
Lots of local AI use cases, I think, are similarly solvable once local models get good at tool use and have the proper harness.
The problem with tool use is that I usually find I only need it for one component of a pipeline. So in this case, mentally, I would structure it as
cat /usr/share/dict/words | print_if_mammal | grep -v 'e'
but I don't know of a good way to incorporate an LLM into a pipeline like that (I know there's a Python API). The question I'm actually interested in is "is this the name of a mammal?", but I don't know of an equivalent of a quiet "batch mode", at least for ollama (and of course there's the question of performance).
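For the per-word check itself, something like this rough sketch might do, assuming ollama's standard REST endpoint on localhost:11434 (the model name is a placeholder for whatever you have pulled):

    import json
    import urllib.request

    def is_mammal(word):
        # One blocking request per word; "stream": False makes ollama
        # return a single JSON object instead of a stream of chunks.
        payload = json.dumps({
            "model": "llama3.2",  # placeholder: use whatever model you have pulled
            "prompt": f"Is '{word}' the name of a mammal? Answer only yes or no.",
            "stream": False,
        }).encode()
        req = urllib.request.Request(
            "http://localhost:11434/api/generate",
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["response"].strip().lower().startswith("yes")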
I guess ultimately I would want to say "write a shell utility that reads lines from standard input and prints each one to standard output if it is the name of a mammal", and then use that utility in the pipeline. Or, better, to have a general llmfilter utility that lets you do something like
cat /usr/share/dict/words | llmfilter "is this a mammal?" | grep -v "e"
and now that I've said that, I think I'll try to make one.
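Something like this might work as a first cut, using the same request pattern as the sketch above (again assuming ollama's default endpoint, with a placeholder model name):

    #!/usr/bin/env python3
    # llmfilter: print each stdin line for which a local LLM answers "yes".
    # Usage: cat /usr/share/dict/words | llmfilter "is this a mammal?"
    import json
    import sys
    import urllib.request

    OLLAMA_URL = "http://localhost:11434/api/generate"  # default ollama endpoint
    MODEL = "llama3.2"  # placeholder: substitute whatever model you have pulled

    def ask_yes_no(question, line):
        prompt = (f"Input: '{line}'. Question: {question} "
                  "Answer with exactly one word, yes or no.")
        payload = json.dumps(
            {"model": MODEL, "prompt": prompt, "stream": False}).encode()
        req = urllib.request.Request(
            OLLAMA_URL, data=payload,
            headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["response"].strip().lower().startswith("yes")

    def main():
        question = sys.argv[1]
        for line in sys.stdin:
            word = line.rstrip("\n")
            if ask_yes_no(question, word):
                # flush so downstream pipes see output promptly
                print(word, flush=True)

    if __name__ == "__main__":
        main()

One request per line will be slow over all of /usr/share/dict/words, so batching several words per prompt (or keeping the model resident with ollama's keep_alive option) would be the first performance fix.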