minimaxir • today at 7:14 PM • 3 replies

Don't use all-MiniLM-L6-v2 for new vector embeddings datasets.

Yes, it's the open-weights embedding model used in all the tutorials, and it was the most pragmatic choice in sentence-transformers when vector stores were in their infancy. But it's old: it doesn't incorporate newer advances in architecture or training-data pipelines, and its context length is capped at 512 tokens when current embedding models handle 2k+ with more efficient tokenizers.

For open weights, I would recommend EmbeddingGemma (https://huggingface.co/google/embeddinggemma-300m) instead, which has very strong benchmark results and a 2k context window. Although it's larger and slower to encode, the payoff is worth it. For a compromise, bge-base-en-v1.5 (https://huggingface.co/BAAI/bge-base-en-v1.5) or nomic-embed-text-v1.5 (https://huggingface.co/nomic-ai/nomic-embed-text-v1.5) are also good.
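
To make the context-length point concrete, here is a minimal sketch of how the token budget changes the number of chunks a long document has to be split into before embedding. It counts whitespace-separated words as a rough stand-in for model tokens (an assumption; real tokenizers produce different counts), and `chunk_by_token_budget` is a hypothetical helper, not part of any library.

```python
from typing import List


def chunk_by_token_budget(text: str, max_tokens: int, overlap: int = 0) -> List[str]:
    """Split text into windows of at most max_tokens whitespace tokens.

    Whitespace tokens only approximate a model tokenizer's output;
    this is an illustration, not a drop-in preprocessing step.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")

    tokens = text.split()
    step = max_tokens - overlap  # how far each new window advances
    chunks: List[str] = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


# A synthetic 3000-"token" document, split for a 512 vs. a 2048 budget.
doc = " ".join(f"w{i}" for i in range(3000))
chunks_512 = chunk_by_token_budget(doc, 512)
chunks_2048 = chunk_by_token_budget(doc, 2048)
print(len(chunks_512), len(chunks_2048))
```

Fewer, longer chunks mean fewer vectors to store and less context lost at chunk boundaries, which is part of why the larger window matters.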

Replies

xfalcox • today at 7:28 PM

I am partial to https://huggingface.co/Qwen/Qwen3-Embedding-0.6B nowadays.

Open weights, multilingual, 32k context.

kaycebasques • today at 8:31 PM

One thing that's still compelling about all-MiniLM is that it's feasible to use client-side. IIRC it's a 70MB download, versus 300MB for EmbeddingGemma (or perhaps it was 700MB?)

Are there any solid models that can be downloaded client-side in less than 100MB?
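
A back-of-the-envelope way to reason about the download question is parameter count times bytes per weight. The sketch below assumes roughly 22.7M parameters for all-MiniLM-L6-v2 and roughly 308M for EmbeddingGemma (approximate figures, not taken from this thread). It ignores tokenizer files and format overhead, so real downloads will differ. `estimated_weights_mb` and `fits_budget` are hypothetical helpers.

```python
# Bytes needed to store one weight at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

# Approximate parameter counts (assumptions for illustration only).
APPROX_PARAMS = {
    "all-MiniLM-L6-v2": 22_700_000,
    "embeddinggemma-300m": 308_000_000,
}


def estimated_weights_mb(num_params: int, dtype: str) -> float:
    """Estimate raw weight size in MB (1 MB = 10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1_000_000


def fits_budget(num_params: int, dtype: str, budget_mb: float = 100.0) -> bool:
    """Check whether the raw weights alone fit under a download budget."""
    return estimated_weights_mb(num_params, dtype) <= budget_mb


for name, params in APPROX_PARAMS.items():
    for dtype in BYTES_PER_PARAM:
        size = estimated_weights_mb(params, dtype)
        print(f"{name:>22} {dtype:>8}: {size:8.1f} MB  under 100MB: {fits_budget(params, dtype)}")
```

Under these assumptions, a model in the ~300M-parameter range stays above 100MB even at one byte per weight, so a sub-100MB client-side download points toward models with well under 100M parameters.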

dangoodmanUT • today at 7:35 PM

Yeah, this. There are much better open-weights models out there...
