Models will always struggle with this specific task without tool use, because of the way they tokenize text. I think a bit of prompt engineering solves this: ask the model to spell out each word, or give it the ability to run a "contains e" Python function over the animal names it generates or searches for.
Lots of local AI use cases, I think, are similarly solvable once local models get good at tool use and have the proper harness.
The problem with tool use is that I usually find I only need it for one component of a pipeline. So in this case, mentally, I would structure it as
cat /usr/share/dict/words | print_if_mammal | grep -v 'e'
but I don't know of a good way to incorporate an LLM into a pipeline like that (I know there's a Python API). The question I'm actually interested in is "is this the name of a mammal?", but I don't know of an equivalent of a quiet "batch mode", at least for ollama (and of course there's the question of performance).
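For the per-word check itself, something like this rough sketch might do, assuming ollama's standard REST endpoint on localhost:11434 (the model name is a placeholder for whatever you have pulled):

    import json
    import urllib.request

    def is_mammal(word):
        # One blocking request per word; "stream": False makes ollama
        # return a single JSON object instead of a stream of chunks.
        payload = json.dumps({
            "model": "llama3.2",  # placeholder: use whatever model you have pulled
            "prompt": f"Is '{word}' the name of a mammal? Answer only yes or no.",
            "stream": False,
        }).encode()
        req = urllib.request.Request(
            "http://localhost:11434/api/generate",
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["response"].strip().lower().startswith("yes")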
I guess ultimately I would want to say "write a shell utility that reads lines from standard input and prints each one to standard output if it is the name of a mammal", and then use that utility in the pipeline. Or, better, to have a general llmfilter utility that lets you do something like
cat /usr/share/dict/words | llmfilter "is this a mammal?" | grep -v "e"
and now that I've said that, I think I'll try to make one.
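Something like this might work as a first cut, using the same request pattern as the sketch above (again assuming ollama's default endpoint, with a placeholder model name):

    #!/usr/bin/env python3
    # llmfilter: print each stdin line for which a local LLM answers "yes".
    # Usage: cat /usr/share/dict/words | llmfilter "is this a mammal?"
    import json
    import sys
    import urllib.request

    OLLAMA_URL = "http://localhost:11434/api/generate"  # default ollama endpoint
    MODEL = "llama3.2"  # placeholder: substitute whatever model you have pulled

    def ask_yes_no(question, line):
        prompt = (f"Input: '{line}'. Question: {question} "
                  "Answer with exactly one word, yes or no.")
        payload = json.dumps(
            {"model": MODEL, "prompt": prompt, "stream": False}).encode()
        req = urllib.request.Request(
            OLLAMA_URL, data=payload,
            headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["response"].strip().lower().startswith("yes")

    def main():
        question = sys.argv[1]
        for line in sys.stdin:
            word = line.rstrip("\n")
            if ask_yes_no(question, word):
                # flush so downstream pipes see output promptly
                print(word, flush=True)

    if __name__ == "__main__":
        main()

One request per line will be slow over all of /usr/share/dict/words, so batching several words per prompt (or keeping the model resident with ollama's keep_alive option) would be the first performance fix.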