Hacker News

hinkley · yesterday at 6:44 PM

I swear I read an article about treaps where, instead of using the weights to balance the tree, they used them to Huffman-encode the search depth, reducing the average access time for heterogeneous fetch frequencies.

I did not bookmark it and about twice a year I go searching for it again. Some say he’s still searching to this day.
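A minimal sketch of the idea as described: use access frequency as the treap priority (max-heap on frequency) so frequently fetched keys bubble toward the root and get shorter search paths. All names here are mine, not from the lost article:

```python
class Node:
    def __init__(self, key, freq):
        self.key, self.freq = key, freq
        self.left = self.right = None

def insert(root, key, freq):
    # Plain BST insert by key, then rotate the new node up while its
    # frequency beats its parent's (heap order on freq, BST order on key).
    if root is None:
        return Node(key, freq)
    if key < root.key:
        root.left = insert(root.left, key, freq)
        if root.left.freq > root.freq:        # rotate right
            pivot, root.left = root.left, root.left.right
            pivot.right = root
            return pivot
    else:
        root.right = insert(root.right, key, freq)
        if root.right.freq > root.freq:       # rotate left
            pivot, root.right = root.right, root.right.left
            pivot.left = root
            return pivot
    return root

def depth(root, key, d=0):
    # Number of comparisons before reaching `key`.
    if root.key == key:
        return d
    return depth(root.left if key < root.key else root.right, key, d + 1)

root = None
for key, freq in [("rare", 1), ("hot", 100), ("warm", 10), ("cool", 5)]:
    root = insert(root, key, freq)
# The hottest key sits at the root, so its lookup cost is minimal.
```

With deterministic frequency priorities instead of random ones, the tree shape approximates an optimal weighted BST rather than a balanced one, which is exactly the Huffman-flavored trade-off the comment describes.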


Replies

ssivark · yesterday at 10:26 PM

Huffman coding assumes your corpus is a string of discrete symbols without any continuous structure (e.g. topology/geometry). With that fairly mild assumption, it gives a recipe to reorganize (transform/encode) your data as a prefix tree, minimizing the average number of bits needed to identify a specific item, i.e. to communicate the contents of your corpus. To go back to the analogy from my previous comment above: if the function you are inverting via search has long plateaus, you could simply front-load those as guesses; that's roughly the spirit of Huffman coding, except it eschews monotonicity.
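For concreteness, here is a short sketch of the standard Huffman construction (my own illustration, not from the comment): repeatedly merge the two least-frequent subtrees, so frequent symbols end up near the root with short codes.

```python
import heapq

def huffman_codes(freqs):
    """Build a Huffman prefix code from {symbol: frequency}.
    Frequent symbols get shorter codes, i.e. sit nearer the root."""
    # Heap entries: (frequency, tiebreak, tree), where a tree is either
    # a bare symbol or a (left, right) pair of subtrees.
    heap = [(f, i, s) for i, (s, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)   # two least-frequent subtrees...
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (a, b)))  # ...get merged
        count += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

codes = huffman_codes({"a": 50, "b": 25, "c": 15, "d": 10})
# "a", the most frequent symbol, gets the shortest code.
```

The code length per symbol is exactly its depth in the tree, which is the link back to the treap idea above: minimizing expected code length and minimizing expected search depth are the same optimization.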