lee101/gobed https... | alt Hacker News

lee1012 • 01/15/2026 • 0 replies • view on HN

lee101/gobed https://github.com/lee101/gobed static embedding models so they are embedded in milliseconds and on gpu search with a cagra style on gpu index with a few things for speed like int8 quantization on the embeddings and fused embedding and search in the same kernel as the embedding really is just a trained map of embeddings per token/averaging