Hacker News

stormfather, yesterday at 6:24 PM

I would try the Qwen models before LLaVA.

Do you need the embeddings to be private? Or just the photos?


Replies

msgodel, yesterday at 7:08 PM

For photo indexing I'd run CLIP directly and save on compute; there's no need to use a whole language model.
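
A rough sketch of what "run CLIP directly" could look like for a local photo index: embed each photo once, embed the text query, and rank by cosine similarity. Everything named here is an assumption for illustration, not something from the thread: `PhotoIndex`, `cosine`, and `make_clip_embedders` are hypothetical helpers, and the CLIP loader assumes the `sentence-transformers` and `Pillow` packages with the `clip-ViT-B-32` checkpoint. The index itself only needs the standard library, so any other embedder can be swapped in.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class PhotoIndex:
    """Hypothetical in-memory index: one embedding per photo path."""

    def __init__(
        self,
        embed_image: Callable[[str], Vector],
        embed_text: Callable[[str], Vector],
    ) -> None:
        self.embed_image = embed_image
        self.embed_text = embed_text
        self.entries: List[Tuple[str, List[float]]] = []

    def add(self, path: str) -> None:
        # Embed once at indexing time; queries never touch the image again.
        self.entries.append((path, list(self.embed_image(path))))

    def search(self, query: str, k: int = 5) -> List[str]:
        # Rank every stored photo by similarity to the query embedding.
        q = list(self.embed_text(query))
        scored = sorted(
            ((cosine(q, vec), path) for path, vec in self.entries),
            reverse=True,
        )
        return [path for _, path in scored[:k]]


def make_clip_embedders(model_name: str = "clip-ViT-B-32"):
    """Assumed setup: CLIP loaded through sentence-transformers, run locally."""
    # Imported lazily so the index above works without these packages.
    from PIL import Image
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)

    def embed_image(path: str) -> List[float]:
        with Image.open(path) as img:
            return model.encode([img.convert("RGB")])[0].tolist()

    def embed_text(text: str) -> List[float]:
        return model.encode([text])[0].tolist()

    return embed_image, embed_text
```

Usage would be along the lines of `index = PhotoIndex(*make_clip_embedders())`, then `index.add(path)` for each photo and `index.search("dog on a beach")`. A linear scan like this is probably fine for a personal library; a much larger collection would likely want a proper vector index instead.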