Hacker News

yazaddaruvala · 07/31/2025 · 3 replies · view on HN

At least in theory: if the model is the same, the embeddings can be reused rather than recomputed.

I believe this is what they mean.

In practice, how fast will the model change (including the tokenizer)? And how fast will the vector DB be fully backfilled to match the new model version?

That would be the “cache hit rate” of sorts and how much it helps likely depends on some of those variables for your specific corpus and query volumes.


Replies

stillpointlab · 07/31/2025

> the embeddings can be reused by the model

I can't find any evidence that this is possible with Gemini or any other LLM provider.

d4rkp4ttern · 08/01/2025

This can’t be what they mean. Even if it were somehow possible, embeddings lose information and are not reversible, i.e., embeddings do not magically compress text into a vector in a way that lets a model recover the source text from the vector.

ivape · 07/31/2025

LLMs can’t take embeddings as input (unless I’m really confused). Even if they could, the embeddings would have lost the word sequence and structure, so they wouldn’t make sense to the LLM.