data + plural = number
data - plural = research
king - crown = (didn't work... crown gets circled in red)
king - princess = emperor
king - queen = kingdom
queen - king = worker
king + queen = queen + king = kingdom
boy + age = (didn't work... boy gets circled in red)
man - age = woman
woman - age = newswoman
woman + age = adult female body (tied with man)
girl + age = female child
girl + old = female child
The other suggestions are pretty similar to the results I got in most cases. But I think this helps illustrate the curse of dimensionality (i.e. distances are ill-defined in high dimensional spaces). This is still quite an unsolved problem and seems a pretty critical one to resolve that doesn't get enough attention.
Distance is extremely well defined in high dimensional spaces. That isn't the problem.
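For anyone following this exchange: one common reading of "curse of dimensionality" here is distance concentration. Distances can still be computed exactly, but the nearest and farthest neighbours of a query end up almost equally far away, so "nearest" carries less information. A rough sketch with uniformly random points (not real embeddings; `relative_contrast` is a name made up for this example):

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from a random query to random points."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # plain Euclidean distance
    return (max(dists) - min(dists)) / min(dists)

# The contrast between nearest and farthest shrinks as the dimension grows.
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```

Real embeddings are not uniformly distributed, so the effect there is likely milder than in this toy setup.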
Yeah I did similar tests and got similar results.
Curious tool but not what I would call accurate.
I got a bunch of red stuff also. I imagine the author cached embeddings for some words but not really all that many to save on credits. I gave it mermaid - woman and got merman, but when I tried to give it boar + woman - man or ram + woman - man, it turns out it has never heard of rams or boars.
Can you elaborate on what the unsolved problem you're referring to is?
Such results are inherently limited because the same word can have different meanings depending on context.
The role of the Attention Layer in LLMs is to give each token a better embedding by accounting for context.
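A toy illustration of that point, assuming a single attention head with identity query/key/value projections (real models learn these projections and stack many layers). The 2-d vectors in `STATIC` are invented for the example:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with identity Q/K/V (a simplification)."""
    d = len(vectors[0])
    out = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        weights = softmax(scores)
        # Each output is a weighted average of all token vectors in the context.
        out.append([sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(d)])
    return out

# Made-up static embeddings: axis 0 ~ "nature", axis 1 ~ "finance".
STATIC = {"river": [2.0, 0.0], "money": [0.0, 2.0], "bank": [1.0, 1.0]}

def contextual(sentence, word):
    vecs = [STATIC[w] for w in sentence]
    return self_attention(vecs)[sentence.index(word)]

bank_by_river = contextual(["river", "bank"], "bank")
bank_by_money = contextual(["money", "bank"], "bank")
print(bank_by_river, bank_by_money)
```

The static vector for "bank" is the same in both sentences, but after one attention step it leans toward whichever neighbour it appeared with.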
I think you need to do A - B + C type queries? A + B or A - B alone wouldn't make much sense, since the magnitude of the result changes.
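A minimal sketch of the A - B + C form with cosine similarity, which ignores vector length when ranking candidates. The 3-d vectors in `TOY` are invented for illustration, and this is not necessarily how the tool under discussion works:

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

def analogy(emb, a, b, c):
    """Return the word nearest to a - b + c by cosine, excluding the inputs."""
    target = [x - y + z for x, y, z in zip(emb[a], emb[b], emb[c])]
    candidates = (w for w in emb if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(emb[w], target))

# Made-up vectors: (royalty, maleness, femaleness).
TOY = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}

print(analogy(TOY, "king", "man", "woman"))

# A bare sum roughly doubles the length compared with a single word vector.
king_plus_queen = [x + y for x, y in zip(TOY["king"], TOY["queen"])]
print(norm(TOY["king"]), norm(king_plus_queen))
```

In A - B + C the subtraction and addition roughly cancel in length, so the result stays in the same region as ordinary word vectors. A bare A + B or A - B does not, which matters if the lookup uses Euclidean distance instead of cosine.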
hacker+news-startup = golfer
Ah yes, 女 + 子 = girl but if combined in a kanji you get 好 = like.
For fun, I pasted these into ChatGPT o4-mini-high and asked it for an opinion:
The results are surprisingly good, I don't think I could've done better as a human. But keep in mind that this doesn't do embedding math like OP! Although it does show how generic LLMs can solve some tasks better than traditional NLP.
The prompt I used:
> Remember those "semantic calculators" with AI embeddings? Like "king - man + woman = queen"? Pretend you're a semantic calculator, and give me the results for the following: