ONNX Runtime and CoreML May Silently Convert Your Model to FP16

61 points • by Two_hands • today at 12:27 AM • 8 comments

Comments

This was an interesting read, thanks for sharing. I've recently been building something that uses the Parakeet v2/v3 models. I'm using the parakeet-rs package (https://github.com/altunenes/parakeet-rs), which has had a few issues running models with CoreML (unrelated to the linked post), e.g. https://github.com/microsoft/onnxruntime/issues/26355

trashtensor • today at 5:37 AM

If you double-click the CoreML file on a Mac, it opens in Xcode, where there is a profiler you can run. The profiler shows you which operations the model is using and the bit depth of each.

yousifa • today at 5:54 AM

On the CoreML side, this is likely because the Neural Engine (ANE) supports FP16, and offloading some or all layers to the ANE significantly reduces inference time and power usage when running models. You can inspect the model in the Xcode profiler to see what is running on each part of the device and at what precision.
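
To get a feel for what a silent FP16 conversion does to individual values, here is a minimal sketch that round-trips Python floats through IEEE 754 half precision using only the standard library. The helper names (`to_fp16_and_back`, `fp16_abs_error`) and the sample values are made up for illustration. This only mimics the rounding step and says nothing about how CoreML or ONNX Runtime actually schedule layers.

```python
import struct


def to_fp16_and_back(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value, then widen it again.

    The "e" format code in the struct module is binary16 (FP16).
    Hypothetical helper, not part of any CoreML or ONNX Runtime API.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fp16_abs_error(x: float) -> float:
    """Absolute error introduced by storing x as FP16."""
    return abs(x - to_fp16_and_back(x))


if __name__ == "__main__":
    # Illustrative values only: small fractions lose digits, and above 2048
    # the gap between adjacent FP16 values is already 2.
    for value in [0.1, 1.0, 1000.1, 2049.5, 65504.0]:
        rounded = to_fp16_and_back(value)
        print(f"{value!r:>10} -> {rounded!r:<20} abs error {fp16_abs_error(value):.6g}")
```

If a model's outputs shift by roughly this kind of error after switching execution providers or compute units, that is a hint (not proof) that some layers are running in FP16.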

DiabloD3 • today at 1:23 AM

[flagged]
