I am dipping my toes in programming. But I can't follow this without graphical representations of the tables as the author (brilliantly) walks through it.
I don't understand the part about key removal. Is it correct to shift values to the left? Can't that break lookups for keys that were inserted at their natural positions?
I always find it interesting how often the simplest hash table layouts end up performing best in real workloads. Once you avoid pointer chasing and keep everything in a compact array, CPU caches do most of the heavy lifting.
It’s also a good reminder that clarity of layout often beats more “clever” designs, especially when the dataset fits comfortably in memory.
This is a smart implementation of Robin Hood hashing that I hadn't seen before. As I understand it, a standard implementation stores the probe length of each entry; this one avoids that thanks to its extra constraints. I don't quite understand the following strategy, though:
> To meet property (3) [if the key 0 is present, its value is not 0] ... "array index plus one" can be stored rather than "array index".
If the hash code can take any value in [0, 2^32), how do you define a special value for empty buckets? The more common solution is a special key, not a special hash code, for empty slots, which is easier to achieve. In addition, as the author points out, supporting generic keys requires storing 32-bit hash values. With the extra 4 bytes per bucket, it is not clear that this implementation beats plain linear probing (my favorite). The fastest hash table implementations, like boost's and abseil's, usually don't use Robin Hood hashing.
Awesome blog! Looking at the code I feel like there’s a kindred soul behind that keyboard, but there’s no About page afaict. Who beeth this mysterious writer?
> If the Zba extension is present, sh3add.uw is a single instruction for zero-extending idx from 32 bits to 64, multiplying it by sizeof(uint64_t), and adding it to slots.
Yay, we've got an equivalent of the x86 SIB byte, but as three (six?) separate opcodes. Well, sub-opcodes.
It's a shame though that Zcmp extension didn't get into RVA23 even as an optional extension.
Is there a specific reason to store the key + value as a `uint64_t` instead of just using a struct like this?

struct slot {
    uint32_t key;
    uint32_t value;
};

This hash table is pretty good. It needs only one memory read when there's no collision. Randomized keys can still collide arbitrarily often, though.
Used Gemini to explain this. Very interesting, and I think Gemini did a good job explaining the Robin Hood fairness mechanism.
> The table occupies at most 32 GiB of memory.
This constraint allows making a linear array of all 4 billion values, with the key as the array index, which fits in 16 GiB. Another 512 MiB is enough for one presence bit per key.
Perhaps text strings as keys and values would give a more interesting example...