Shouldn't be hard. What backend/hardware are you interested in running this with? I'll add an example of using the model from C++ with ONNX. Btw, check out the roadmap: our inference engine will be out in 1-2 weeks and is expected to be faster than ONNX Runtime.
desktop CPUs running inference on a single background thread would be the ideal case for what I'm considering.
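Something along these lines would cover it. This is just a minimal sketch with the ONNX Runtime C++ API pinned to one intra-op thread on CPU; the `model.onnx` path and the 1x3x224x224 float input shape are placeholders for illustration, not part of your project, and `GetInputNameAllocated` needs ONNX Runtime >= 1.13.

```cpp
#include <onnxruntime_cxx_api.h>
#include <iostream>
#include <vector>

int main() {
    // Environment and session options; keep inference on a single thread
    // so it can run on one background thread without oversubscribing the CPU.
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "single-thread-example");
    Ort::SessionOptions options;
    options.SetIntraOpNumThreads(1);
    options.SetInterOpNumThreads(1);

    // "model.onnx" is a placeholder path (on Windows the path must be wide-char).
    Ort::Session session(env, "model.onnx", options);

    // Look up I/O names from the model itself.
    Ort::AllocatorWithDefaultOptions allocator;
    auto input_name = session.GetInputNameAllocated(0, allocator);
    auto output_name = session.GetOutputNameAllocated(0, allocator);

    // Placeholder input: a zero-filled 1x3x224x224 float tensor.
    std::vector<float> input_data(1 * 3 * 224 * 224, 0.0f);
    std::vector<int64_t> input_shape = {1, 3, 224, 224};

    Ort::MemoryInfo mem_info =
        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeCPU);
    Ort::Value input_tensor = Ort::Value::CreateTensor<float>(
        mem_info, input_data.data(), input_data.size(),
        input_shape.data(), input_shape.size());

    const char* input_names[]  = {input_name.get()};
    const char* output_names[] = {output_name.get()};

    // Run inference; with the thread settings above this stays on the calling thread.
    auto outputs = session.Run(Ort::RunOptions{nullptr},
                               input_names, &input_tensor, 1,
                               output_names, 1);

    float* out = outputs[0].GetTensorMutableData<float>();
    std::cout << "first output value: " << out[0] << "\n";
    return 0;
}
```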