Hacker News

marcyb5st · today at 9:01 AM

You can do that with ONNX. You can graft the preprocessing layers onto the actual model [1] and then serve that. Honestly, I had assumed that ONNX (on CPU at least) was already low-level code and already heavily optimized.

@Author - if you see this, would it be possible to add comparisons (i.e. "vanilla" inference latencies vs. timber)?

[1] https://gist.github.com/msteiner-google/5f03534b0df58d32abcc... <-- A gist I put together a while back that goes from PyTorch to ONNX and grafts the preprocessing layers onto the model, so you can pass it the raw input.
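
For anyone skimming, here is a minimal sketch of the idea (my own example, not the gist itself; the resnet18 backbone, the ImageNet normalization stats, and all file/tensor names are placeholder assumptions):

    import torch
    import torch.nn as nn
    import torchvision

    class PreprocessAndModel(nn.Module):
        """Bakes preprocessing into the graph so the exported ONNX
        model accepts raw uint8 images instead of normalized floats."""
        def __init__(self, model: nn.Module):
            super().__init__()
            self.model = model
            # ImageNet stats as an example; use whatever your model was trained with
            self.register_buffer("mean", torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1))
            self.register_buffer("std", torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1))

        def forward(self, raw: torch.Tensor) -> torch.Tensor:
            # uint8 [0, 255] -> normalized float32; traced into the ONNX graph
            x = raw.float() / 255.0
            x = (x - self.mean) / self.std
            return self.model(x)

    net = torchvision.models.resnet18(weights=None).eval()  # placeholder model
    wrapped = PreprocessAndModel(net).eval()

    # Export with a raw uint8 input so the Cast/normalize ops end up in the graph
    dummy = torch.randint(0, 256, (1, 3, 224, 224), dtype=torch.uint8)
    torch.onnx.export(
        wrapped,
        (dummy,),
        "model_with_preproc.onnx",
        input_names=["raw_image"],
        output_names=["logits"],
        dynamic_axes={"raw_image": {0: "batch"}},
    )

The serving side then feeds raw pixels straight to ONNX Runtime, no separate preprocessing step:

    import numpy as np
    import onnxruntime as ort

    sess = ort.InferenceSession("model_with_preproc.onnx")
    img = np.random.randint(0, 256, (1, 3, 224, 224), dtype=np.uint8)
    logits = sess.run(None, {"raw_image": img})[0]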


Replies

kossisoroyce · today at 9:38 AM

I'll check this out as soon as I am at my desk.