Hacker News

Show HN: Timber – Ollama for classical ML models, 336x faster than Python

137 points by kossisoroyce today at 12:57 AM | 27 comments

Comments

tl2do today at 3:41 AM

Since generative AI exploded, it's all anyone talks about. But traditional ML still covers a vast space in real-world production systems. I don't need this tool right now, but I'm glad to see work in this area.

mehdibl today at 3:28 AM

Ollama is quite a bad example here. Despite its popularity, it's a simple wrapper, and it's increasingly being displaced by the project it wraps, llama.cpp.

I don't understand the parallel here.

brokensegue today at 3:50 AM

"classical ML" models typically have a more narrow range of applicability. in my mind the value of ollama is that you can easily download and swap-out different models with the same API. many of the models will be roughly interchangeable with tradeoffs you can compute.

if you're working on a fraud problem an open-source fraud model will probably be useless (if it even could exist). and if you own the entire training to inference pipeline i'm not sure what this offers? i guess you can easily swap the backends? maybe for ensembling?
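
For concreteness, a minimal sketch of the "same API, swappable backends" idea, using scikit-learn's shared estimator interface; the model choices and the averaging at the end are illustrative, not Timber's actual API:

    # Illustrative only: three classical models behind one duck-typed interface.
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
    from sklearn.linear_model import LogisticRegression

    backends = {
        "logreg": LogisticRegression(max_iter=1000),
        "forest": RandomForestClassifier(n_estimators=100),
        "gbm": GradientBoostingClassifier(),
    }

    X = np.random.rand(200, 8)
    y = np.random.randint(0, 2, 200)
    for model in backends.values():
        model.fit(X, y)

    # Swap backends behind one call site, or average them as a crude ensemble.
    probs = np.mean([m.predict_proba(X)[:, 1] for m in backends.values()], axis=0)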

theanonymousone today at 7:35 AM

Wouldn't it be much more useful if the request received the raw input (i.e., before feature extraction) rather than the feature vector?
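
To make the distinction concrete, here are two hypothetical request payloads; neither is Timber's actual schema:

    # Hypothetical payloads, not Timber's schema. First style: the caller has
    # already run feature extraction and ships a bare vector.
    feature_vector_request = {
        "model": "fraud-gbm",
        "inputs": [[0.13, 1.0, 42.0, 0.0]],
    }

    # Suggested style: the server owns the feature pipeline (encoding,
    # scaling, etc.) and accepts the raw record instead.
    raw_input_request = {
        "model": "fraud-gbm",
        "inputs": [{"amount": 42.0, "country": "DE", "hour": 23}],
    }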

rudhdb773b today at 4:00 AM

If the focus is performance, why use a separate process and take on data-serialization overhead?

Why not a typical shared library that can be loaded into Python, R, Julia, etc., and run over large data sets without even a memory copy?
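
As a sketch of that alternative: a hypothetical libtimber.so exposing a predict symbol could be called from Python via ctypes, reading NumPy's buffer in place. The library name and signature here are assumptions, not Timber's actual interface:

    import ctypes
    import numpy as np

    # Hypothetical library and symbol; only the zero-copy calling pattern matters.
    lib = ctypes.CDLL("./libtimber.so")
    lib.predict.argtypes = [
        ctypes.POINTER(ctypes.c_double),  # feature matrix, row-major
        ctypes.c_size_t,                  # n_rows
        ctypes.c_size_t,                  # n_cols
        ctypes.POINTER(ctypes.c_double),  # output buffer, n_rows long
    ]

    X = np.ascontiguousarray(np.random.rand(100_000, 8))
    out = np.empty(X.shape[0])

    # The library reads NumPy's memory directly: no IPC, no serialization, no copy.
    lib.predict(
        X.ctypes.data_as(ctypes.POINTER(ctypes.c_double)),
        X.shape[0],
        X.shape[1],
        out.ctypes.data_as(ctypes.POINTER(ctypes.c_double)),
    )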

Dansvidania today at 3:22 AM

Can’t check it out yet, but the concept alone sounds great. Thank you for sharing.

palashkulsh today at 6:41 AM

Nice idea, I needed something like this.

o10449366 today at 5:28 AM

Can you tell us more about the motivation for this project? I'm very curious if it was driven by a specific use case.

I know there are specialized trading firms that have implemented projects like this, but most industry workflows I know of still involve data pipelines, with scientists doing intermediate data transformations before feeding them into these models. Even the C-backed libraries like numpy/pandas still explicitly depend on the CPython API and can't be compiled away, and this data-feed step tends to be the bottleneck in my experience.

That isn't to say this isn't a worthy project - I've explored similar initiatives myself - but my conclusion was that unless your data source is pre-configured to feed directly into your specific model without any intermediate transformation steps, optimizing inference time has marginal benefit for the overall pipeline. I lament this as an engineer who loves making things go fast but has to work with scientists who love the convenience of Jupyter notebooks and the numpy/pandas APIs.
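
One way to sanity-check that claim on a given workload is to time the transform and inference stages separately before optimizing either; a toy sketch where the absolute numbers are meaningless and only the per-stage split matters:

    import time
    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    df = pd.DataFrame({
        "amount": np.random.rand(100_000),
        "country": np.random.choice(["US", "DE", "JP"], 100_000),
    })
    model = RandomForestClassifier(n_estimators=50).fit(
        np.random.rand(100, 4), np.random.randint(0, 2, 100))

    t0 = time.perf_counter()
    X = pd.get_dummies(df, columns=["country"], dtype=float).to_numpy()  # feature prep
    t1 = time.perf_counter()
    model.predict(X)                                                     # inference
    t2 = time.perf_counter()
    print(f"transform: {t1 - t0:.3f}s  inference: {t2 - t1:.3f}s")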

jnstrdm05 today at 2:40 AM

I have been waiting for this! Nice

OutOfHere today at 7:24 AM

It would be safer to use a Zig, Rust, or Nim target. C risks memory-unsafe behavior, and the risk profile is even bigger for vibe-coded implementations.
