logoalt Hacker News

baobunlast Tuesday at 1:13 PM0 repliesview on HN

> it’s hard or impossible to define hash functions that map approximately equal keys to identical hash values. Why? Let’s take a look at what happens with the common epsilon comparison, for example: two numbers a, b are considered equal if |a – b| < ε. With this definition of equality all numbers between -ε/2; and ε/2 are considered equal and therefore must have the same hash value h. But the numbers between 0 and ε are also equal to each other, so their hash value must be h as well.

> If we continue in this manner we see that all all numbers must have the same hash value h regardless of the choice of ε. The idea of approximate comparison is unfortunately hard to reconcile with the non-approximate nature of hash functions.

I think there is at least one non-sequitur in there? The only thing you prove is that directly modeling hash keys on top of that notion of equality is not useful / the model is weak. I don't think equality is typically associative with epsilon comparison.

(The license of this comment forbids it from being used as support in favor of floats as hash keys)