Hacker News

theanonymousone · today at 7:35 AM

Wouldn't it be much more useful if the request received raw input (i.e. before feature extraction), and not the feature vector?


Replies

marcyb5st · today at 9:01 AM

You can do that with ONNX. You can graft the preprocessing layers onto the actual model [1] and then serve the combined graph. Honestly, I had assumed that ONNX (on CPU at least) was already low-level and heavily optimized.

@Author - if you see this, is it possible to add comparisons (i.e. "vanilla" inference latencies vs. timber)?

[1] https://gist.github.com/msteiner-google/5f03534b0df58d32abcc... <-- A gist I put together a while back that goes from PyTorch to ONNX and grafts the preprocessing layers onto the model, so you can pass the raw input.
