aegis_camera • yesterday at 10:17 PM • 0 replies

Thanks. We are not a large R&D lab, and our resources are limited. We were working on a product that is a local-VLM-first, BYOD-when-you-want video security application. Our users asked for an MLX backend benchmark comparison. We tried hard not to ship Python in the application bundle, so we searched for a pure-binary MLX implementation, and the results showed that we needed to build one. It took us two weeks to get it working, and we have been testing it with multiple models. For reference, you can see the results here: https://www.sharpai.org/benchmark/

Then we saw Google's announcement about TurboQuant. It's so cool that we started integrating it (along with SSD/Flash streaming). It's a non-trivial process, so thanks for your support and understanding. When we saw the mobile application come alive with the QWEN 3 1.7B model, we thought it was worth it.

If something similar and well maintained comes along, we will definitely adopt it, since our target is production delivery. If this one gets good support from the community, we will continue to support it.

I think all the posts here gave us a reason to continue.
