Hacker News

Playing with Vision Embeddings

40 points by prestoj, last Friday at 2:54 PM | 3 comments

Comments

jcattle, today at 7:21 AM

Very nice visualizations, thanks for that!

One thing I still struggle with in my head is how these vision embeddings can then be used to give LLMs eyes.

Because you somehow need a giant training set that describes images in natural language, no? Is that actually how it works, or is there some smart trick so you don't need to pay labellers a bunch of money to look at pictures and describe them?

SkitterKherpi, today at 8:43 AM

[dead]