
Show HN: Semantic Calculator (king-man+woman=?)

176 points by nxa | 05/14/2025 | 172 comments

I've been playing with embeddings and wanted to see what results the embedding layer produces from plain word-by-word input combined with addition / subtraction, beyond what many videos / papers mention (like the obvious king-man+woman=queen). So I built something that doesn't just give the first answer, but ranks the matches by distance / cosine similarity. I polished it a bit so that others can try it out, too.

For now, I only have nouns (and some proper nouns) in the dataset, and pick the most common interpretation among the homographs. Also, it's case sensitive.
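A minimal sketch of the idea described in the post, assuming nothing about the actual implementation: signed word vectors are summed, and every vocabulary word is then ranked by cosine similarity to the result. The `VECTORS` table and the `evaluate` and `rank` helpers are hypothetical, and the 3-dimensional values are invented purely for illustration (real embeddings come from a trained model and have hundreds of dimensions).

```python
import math

# Toy vectors invented for illustration only; not taken from any real model.
VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.3, 0.3],
}

def cosine(a, b):
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)

def evaluate(terms):
    """Sum signed word vectors, e.g. [(+1, 'king'), (-1, 'man'), (+1, 'woman')]."""
    dim = len(next(iter(VECTORS.values())))
    result = [0.0] * dim
    for sign, word in terms:
        for i, x in enumerate(VECTORS[word]):
            result[i] += sign * x
    return result

def rank(terms, top_n=3):
    """Return the top_n vocabulary words closest to the evaluated expression."""
    target = evaluate(terms)
    scored = sorted(
        ((cosine(target, vec), word) for word, vec in VECTORS.items()),
        reverse=True,
    )
    return [(word, round(score, 3)) for score, word in scored[:top_n]]

ranking = rank([(+1, "king"), (-1, "man"), (+1, "woman")])
print(ranking)  # "queen" ranks first with these toy values
```

Returning a ranked list rather than a single word is what makes near-misses visible, which is the behavior the post highlights.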


Comments

godelski 05/14/2025

  data + plural = number
  data - plural = research
  king - crown = (didn't work... crown gets circled in red)
  king - princess = emperor
  king - queen = kingdom
  queen - king = worker
  king + queen = queen + king = kingdom
  boy + age = (didn't work... boy gets circled in red)
  man - age = woman
  woman - age = newswoman
  woman + age = adult female body (tied with man)
  girl + age = female child
  girl + old = female child
The other suggestions are pretty similar to the results I got in most cases. But I think this helps illustrate the curse of dimensionality (i.e. distances become less meaningful in high-dimensional spaces). This is still an unsolved problem, and it seems a pretty critical one to resolve that doesn't get enough attention.
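One common way to make the curse-of-dimensionality point concrete is to measure how the gap between the nearest and farthest neighbor shrinks as dimension grows. The sketch below is an illustration under stated assumptions, not a claim about the tool's embeddings: it uses independent Gaussian random points, whereas real embeddings have structure, and `relative_contrast` is a hypothetical helper name.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from one random query point."""
    rng = random.Random(seed)
    query = [rng.gauss(0, 1) for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.gauss(0, 1) for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)

# The contrast drops sharply as the dimension increases.
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 2))
```

When the contrast is small, the best match and the hundredth-best match are almost equally far from the query, so small perturbations can reorder the ranking.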
montebicyclelo 05/14/2025

> king-man+woman=queen

This is the famous example everyone uses when talking about word vectors, but is it actually just cherry-picked?

I.e., are there many other "meaningful" examples like this, or do you, the majority of the time, end up with some vaguely, tangentially related word when adding and subtracting word vectors?

(Which seems to be what this tool is helping to illustrate, having briefly played with it, and looked at the other comments here.)

(Btw, not saying wordvecs / embeddings aren't extremely useful, just talking about this simplistic arithmetic)

spindump8930 05/14/2025

First off, this interface is very nice and a pleasure to use, congrats!

Are you using word2vec for these, or embeddings from another model?

I also wanted to add some flavor since it looks like many folks in this thread haven't seen something like this - it's been known since 2013 that we can do this (but it's great to remind folks especially with all the "modern" interest in NLP).

It's also known (in some circles!) that a lot of these vector arithmetic things need some tricks to really shine. One example is excluding the words already present in the query from the candidate answers [1]. Others in this thread seem surprised at some of the biases present - there's also a long history of work on that [2,3].

[1] https://blog.esciencecenter.nl/king-man-woman-king-9a7fd2935...

[2] https://arxiv.org/abs/1905.09866

[3] https://arxiv.org/abs/1903.03862
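The exclusion trick mentioned in this comment can be sketched as follows. The vectors are invented toy values chosen so that the raw nearest neighbor of king - man + woman is "king" itself, and `nearest` is a hypothetical helper, not any library's API.

```python
import math

# Toy vectors invented for illustration; the offset man -> woman is small
# compared with the "king" vector, so the result stays close to "king".
VECTORS = {
    "king":  [1.0, 0.2, 0.0],
    "queen": [0.7, 0.0, 0.6],
    "man":   [0.1, 0.3, 0.0],
    "woman": [0.1, 0.0, 0.2],
    "apple": [0.0, 0.5, 0.5],
}

def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def nearest(plus, minus, exclude_inputs):
    """Nearest word to sum(plus) - sum(minus), optionally skipping query words."""
    dim = len(next(iter(VECTORS.values())))
    target = [0.0] * dim
    for word in plus:
        target = [t + x for t, x in zip(target, VECTORS[word])]
    for word in minus:
        target = [t - x for t, x in zip(target, VECTORS[word])]
    query_words = set(plus) | set(minus)
    candidates = [
        w for w in VECTORS if not (exclude_inputs and w in query_words)
    ]
    return max(candidates, key=lambda w: cosine(target, VECTORS[w]))

raw = nearest(["king", "woman"], ["man"], exclude_inputs=False)
filtered = nearest(["king", "woman"], ["man"], exclude_inputs=True)
print(raw, filtered)  # king queen
```

With these values the unfiltered answer is one of the input words, and "queen" only surfaces once the query words are removed from the candidates.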

antidnan 05/14/2025

Neat! Reminds me of infinite craft

https://neal.fun/infinite-craft/

lcnPylGDnU4H9OF 05/14/2025

Some of these make more sense than others (and bookshop is hilarious even if it's only the best answer by a small margin; no shade to bookshop owners).

  map - legend = Mercator projection
  noodle - wheat = egg noodle
  noodle - gluten = tagliatelle
  architecture - calculus = architectural style
  answer - question = comment
  shop - income = bookshop
  curry - curry powder = cuisine
  rice - grain = chicken and rice
  rice + chicken = poultry
  milk + cereal = grain
  blue - yellow = Fiji
  blue - Fiji = orange
  blue - Arkansas + Bahamas + Florida - Pluto = Grenada
jumploops 05/14/2025

This is super neat.

I built a game[0] along similar lines, inspired by infinite craft[1].

The idea is that you combine (or subtract) “elements” until you find the goal element.

I’ve had a lot of fun with it, but it often hits the same generated element. Maybe I should update it to use the second (third, etc.) choice, similar to your tool.

[0] https://alchemy.magicloops.app/

[1] https://neal.fun/infinite-craft/

lightyrs 05/14/2025

I don't get it but I'm not sure I'm supposed to.

    life + death = mortality
    life - death = lifestyle

    drug + time = occasion
    drug - time = narcotic

    art + artist + money = creativity
    art + artist - money = muse

    happiness + politics = contentment
    happiness + art      = gladness
    happiness + money    = joy
    happiness + love     = joy
__MatrixMan__ 05/14/2025

Here's a challenge: find something to subtract from "hammer" which does not result in a word that has "gun" as a substring. I've been unsuccessful so far.

grey-area 05/14/2025

As you might expect from a system with knowledge of word relations but without understanding or a model of the world, this generates gibberish which occasionally sounds interesting.

nxa 05/14/2025

This might be helpful: I haven't implemented it in the UI, but from the API response you can see what the word definitions are, both for the input and the output. If the output has homographs, likeliness is split per definition, but the UI only shows the best one.

Also, if it gets buried in comments, proper nouns need to be capitalized (Paris-France+Germany).

I am planning on patching up the UI based on your feedback.

GrantMoyer 05/15/2025

These are pretty good results. I messed around with a dumber and more naive version of this a few years ago[1], and it wasn't easy to get sensible output most of the time.

[1]: https://github.com/GrantMoyer/word_alignment

rdlw 05/14/2025

I've always wondered if there's a way to find which vectors are most important in a model like this. The gender vector man-woman or woman-man is the one always used in examples, since English has many gendered terms, but I wonder if it's possible to generate these pairs from the data. Maybe list all differences of pairs of vectors, and see if there are any clusters. I imagine some grammatical features would show up, like the plurality vector people-person, or the past tense vector walked-walk, but maybe there would be some that are surprisingly common but don't seem to map cleanly to an obvious concept.

Or maybe they would all be completely inscrutable and man-woman would be like the 50th strongest result.
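The "list all differences and look for clusters" idea could be sketched like this. It is a brute-force illustration on invented 4-dimensional toy vectors with a simple greedy grouping by direction; `cluster_differences` and the 0.95 threshold are assumptions, and a real vocabulary would need sampling or an approximate method because the number of pairs grows quadratically.

```python
import math
from itertools import permutations

# Toy vectors invented for illustration: [gender, tense, royalty, motion].
VECTORS = {
    "man":    [1.0, 0.0, 0.2, 0.0],
    "woman":  [-1.0, 0.0, 0.2, 0.0],
    "king":   [0.9, 0.0, 1.0, 0.0],
    "queen":  [-1.0, 0.0, 1.0, 0.0],
    "walk":   [0.0, 0.0, 0.0, 1.0],
    "walked": [0.0, 1.0, 0.0, 1.0],
    "talk":   [0.0, 0.0, 0.0, -1.0],
    "talked": [0.0, 1.0, 0.0, -1.0],
}

def unit(v):
    """Scale v to length 1, or return None for a zero vector."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else None

def cluster_differences(vectors, threshold=0.95):
    """Greedily group ordered word pairs whose difference vectors point the same way."""
    clusters = []  # list of (seed_direction, [(a, b), ...])
    for a, b in permutations(vectors, 2):
        direction = unit([x - y for x, y in zip(vectors[a], vectors[b])])
        if direction is None:
            continue
        for seed, members in clusters:
            # Both vectors are unit length, so the dot product is the cosine.
            if sum(x * y for x, y in zip(seed, direction)) >= threshold:
                members.append((a, b))
                break
        else:
            clusters.append((direction, [(a, b)]))
    return sorted((members for _, members in clusters), key=len, reverse=True)

clusters = cluster_differences(VECTORS)
for members in clusters[:4]:
    print(len(members), members)
```

In this toy setup the pairs (man, woman) and (king, queen) land in one group and (walked, walk) and (talked, talk) in another, which is the kind of recurring offset the comment speculates about.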

ale42 05/15/2025

Not what it's meant for, I guess, but it's not very strong at chemistry ;-)

  salt - chlorine + potassium = sodium
  chlorine + sodium = rubidium
  water - hydrogen = tap water
It also has some other interesting outputs:

  woman + man = adult female body (already reported by someone else)
  man - hand = woman
  woman - hand = businesswoman
  businessman - male + female = industrialist
  telephone + antenna = television equipment
  olive oil - oil = hearth money
anonu 05/15/2025

Reminds me of the very annoying word game https://contexto.me/en/

skeptrune 05/14/2025

This is super fun. Offering the ranked matches makes it significantly more engaging than just showing the final result.

ericdiao 05/14/2025

Interesting: parent + male = female (83%)

I personally cannot find the connection here; I was expecting "father" or something.

afandian 05/14/2025

There was a site like this a few years ago (before all the LLM stuff kicked off) that had this and other NLP functionality. Styling was grey and basic. That’s all I remember.

I’ve been unable to find it since. Does anyone know which site I’m thinking of?

clbrmbr 05/15/2025

A few favorites:

wine - beer = grape juice

beer - wine = bowling

astrology - astronomy + mathematics = arithmancy

galaxyLogic 05/15/2025

What about starting with the result and finding a set of words that, when summed together, give that result?

That could be seen as trying to find the true "meaning" of a word.

nxa 05/15/2025

artificial intelligence - bullsh*t = computer science (34%)

tiborsaas 05/15/2025

I've tried to get to "garage", but failed after a few attempts. ChatGPT's ideas also seemed reasonable, but they failed too. Any takers? :)

fallinghawks 05/14/2025

goshawk-cocaine = gyrfalcon, which is funny if you know anything about goshawks and gyrfalcons

(Goshawks are very intense, gyrs tend to be leisurely in flight.)

neom 05/14/2025

Cool, but I guess there is not enough data for it to be useful yet. Most of my queries either were missing words or were a few % off the answer: vehicle - road + ocean gave me hydrosphere, but the other options below it were boat, ship, etc. Klimt almost made it from Mozart - music + painting. doctor - hospital + school = teacher, nailed it.

Getting to cornbread elegantly has been challenging.

yigitkonur35 05/14/2025

shows how bad embeddings are in a practical way

ignat_244639 05/15/2025

Huh, that's strange. I wanted to check whether your embeddings have biases, but I cannot use the word "white" at all. So I cannot get an answer to "man - white + black = ?".

But if I assume the biased answer and rearrange the operands, I get "man - criminal + black = white", which clearly shows how biased your embeddings are!

Funny thing: fixing biases, and handling the ways to circumvent the fixes (while keeping good UX), might be a much more challenging task :)

TZubiri 05/14/2025

I'm getting Navratilova instead of queen. And I can't get other words to work; I get red circles or no answer at all.

Jimmc414 05/14/2025

dog - cat = paleolith

paleolith + cat = Paleolithic Age

paleolith + dog = Paleolithic Age

paleolith - cat = neolith

paleolith - dog = hand ax

cat - dog = meow

Wonder if some of the math is off or I am not using this properly

andrelaszlo 05/15/2025

    hand - arm + leg = vertebrate foot
    snowman - man = snowflake
    snowman - snow = snowbank
e____g 05/15/2025

man - intelligence = woman (36%)

woman + intelligence = man (77%)

Oof.

wdutch 05/15/2025

It's interesting that I find loops. For example:

car + stupid = idiot, car + idiot = stupid

nikolay 05/14/2025

Really?!

  man - brain = woman
  woman - brain = businesswoman
cabalamat 05/14/2025

What does it mean when it surrounds a word in red? Is this signalling an error?

dtj1123 05/15/2025

"man-intelligence=woman" is a particularly interesting result.

ericdiao 05/14/2025

wine - alcohol = grape juice (32%)

Accurate.

coolcase 05/15/2025

Oh you have all the damn words. Even the Ricky Gervais ones.

downboots 05/14/2025

mathematics - Santa Claus = applied mathematics

hacker - code = professional golf

krishna-vakx 05/15/2025

For founders:

love + time = commitment

boredom + curiosity = exploration

vision + execution = innovation

resilience - fear = courage

ambition + humility = leadership

failure + reflection = learning

knowledge + application = wisdom

feedback + openness = improvement

experience - ego = mastery

idea + validation = product-market fit

matallo 05/14/2025

uncle + aunt = great-uncle (91%)

great idea, but I find the results unamusing

havkom 05/15/2025

I tried:

-red

and:

red-red-red

But it did not work; I did not get any response. Maybe I am being stupid, but shouldn't this work?

hagen_dogs 05/15/2025

fluid + liquid = solid (85%) -- didn't expect that

blue + red = yellow (87%) -- rgb, neat

black + {red,blue,yellow,green} = white (83%) -- weird

MYEUHD 05/14/2025

king - man + woman = queen

queen - woman + man = drone

Glyptodon 05/15/2025

Car - Wheel(s) doesn't really have results I'd guess at (boat, sled, etc.). Just specific four wheeled vehicles.

hello_computer 05/15/2025

doesn’t do anything on my iphone

Finbel 05/15/2025

London-England+France=Maupassant

firejake308 05/14/2025

King-man+woman=Navratilova, who is apparently a Czech tennis player. Apparently, it's very case-sensitive. Cool idea!

cosmicgadget 05/15/2025

  car + dragon = panzer
maxcomperatore 05/15/2025

Just use an LLM API to generate results; it will be far better and more accurate than a weird home-cooked algorithm.

darepublic 05/15/2025

man - courage = husband

kylecazar 05/14/2025

Woman + president = man
