Very cool stuff. Love the focus on CPU-first.
Would also love to see some throughput numbers on a basic VM setup.
Edit: there are some latency numbers in the paper https://arxiv.org/pdf/2507.18546
Zero-shot encoder models are so cool. I'll definitely be checking this out.
If you're looking for a zero-shot classifier, tasksource is in a similar vein.
https://huggingface.co/tasksource/ModernBERT-large-nli
I guess this is version 1. Is it still being maintained?
https://github.com/urchade/GLiNER