I am partial to https://huggingface.co/Qwen/Qwen3-Embedding-0.6B nowadays.
Open weights, multilingual, 32k context.
It also supports Matryoshka embeddings, and you can guide matches by prefixing an instruction onto the query.
I have ~50 million sentences from English Project Gutenberg novels embedded with this.
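A rough numpy sketch of what those two features mean in practice. The `Instruct: ...\nQuery: ...` prefix format is my reading of the Qwen3 embedding model card, and the vectors below are random stand-ins for real model output, so treat this as an illustration of the Matryoshka idea rather than actual model usage:

```python
import numpy as np

rng = np.random.default_rng(0)
full_dim = 1024  # Qwen3-Embedding-0.6B's full embedding width

def matryoshka_truncate(emb, dim):
    """Keep the first `dim` dims and re-normalize. The Matryoshka property
    means this prefix of the vector was trained to stand alone as a
    smaller embedding, trading accuracy for storage/speed."""
    cut = emb[..., :dim]
    return cut / np.linalg.norm(cut, axis=-1, keepdims=True)

def with_instruction(task, query):
    """Query-side instruction prefix (format assumed from the model card);
    documents are embedded without any prefix."""
    return f"Instruct: {task}\nQuery: {query}"

# Random unit vectors standing in for document embeddings.
doc_embs = rng.normal(size=(3, full_dim))
doc_embs /= np.linalg.norm(doc_embs, axis=-1, keepdims=True)

# Truncate both sides to 256 dims and score by cosine similarity
# (a plain dot product, since everything is re-normalized).
q = matryoshka_truncate(doc_embs[0], 256)
docs256 = matryoshka_truncate(doc_embs, 256)
scores = docs256 @ q
print(scores.argmax())  # → 0: the truncated query still matches its own doc
```

In a real pipeline you'd get `doc_embs` from the model (e.g. via sentence-transformers) and store only the truncated vectors, which is the main point of Matryoshka at 50M-sentence scale.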
It's junk compared to BGE-M3 on my retrieval tasks.