Hacker News

jstanley · yesterday at 2:39 PM

As a teenager I spent a weekend thinking that if binary search was good, because it cuts the search space in half at every step, then wouldn't a ternary search be better? Because we'd cut it into thirds at every step.

So instead of just comparing the middle value, we'd compare the one at the 1/3 point, and if that turns out to be too low then we compare the value at the 2/3 point.

Unfortunately, although we cut the search space to 2/3 of what it was for binary search at each step (1/3 vs 1/2), we do 3/2 as many comparisons at each step (one comparison 50% of the time, two comparisons the other 50%), so it averages out to equivalence.

EDIT: See zamadatix reply, it's actually 5/3 as many comparisons because 2/3 of the time you have to do 2.
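The two-probe scheme described above can be sketched in Python (a hypothetical illustration written for this thread, not anyone's actual code); it probes the 1/3 point first and only touches the 2/3 point when the first probe comes up low:

```python
def ternary_search(a, x):
    """Search sorted list a for x: probe the 1/3 point, then (if
    needed) the 2/3 point. Returns (index or None, probes used)."""
    lo, hi = 0, len(a) - 1          # inclusive window
    probes = 0
    while lo <= hi:
        third = (hi - lo) // 3
        m1, m2 = lo + third, hi - third
        probes += 1                 # first probe: the 1/3 point
        if x == a[m1]:
            return m1, probes
        if x < a[m1]:
            hi = m1 - 1
            continue
        probes += 1                 # second probe: the 2/3 point
        if x == a[m2]:
            return m2, probes
        if x < a[m2]:
            lo, hi = m1 + 1, m2 - 1
        else:
            lo = m2 + 1
    return None, probes
```

Only the left-third outcome gets away with a single probe per step, which is exactly where the 50%-vs-1/3 correction below comes in.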


Replies

zamadatix · yesterday at 3:02 PM

This ternary approach doesn't actually average 3/2 comparisons per level:

- First third: 1 comparison

- Second third: 2 comparisons

- Third third: 2 comparisons

(1+2+2)/3 = 5/3 average comparisons. I think the mistake starts with assuming it's 50% of the time, because it feels like "either you do 1 comparison or 2", but really there is a 1/3 chance it's settled by the 1st comparison and a 2/3 chance it takes 2 comparisons.

This lets us show ternary is worse in total average comparisons, just barely: 5/3 * Log_3[n] ≈ 1.052 * Log_2[n].

In other words, you end up with fewer levels but doing more comparisons (on average) to get to the end. This is true for all searches of this type (w/ a few general assumptions like the values being searched for are evenly distributed and the cost of the operations is idealized - which is where the main article comes in) where the number of splits is > 2.
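That roughly 5% penalty can be checked numerically; this quick sketch (illustrative numbers, not from the comment) compares expected comparison counts for a million-key search:

```python
import math

n = 2**20
binary_cmps = math.log2(n)              # expected comparisons, binary: 20
ternary_cmps = (5/3) * math.log(n, 3)   # expected comparisons, ternary: ~21.03
print(ternary_cmps / binary_cmps)       # ~1.052, i.e. (5/3) / log2(3)
```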

GuB-42 · yesterday at 3:51 PM

It turns out that teenager you was on to something.

Not for the ternary version of the binary search algorithm, because what you had is just a skewed binary search, not an actual ternary search. Because comparisons are binary by nature, any search algorithm involving comparisons is a type of binary search, and any choice other than the middle element is less efficient in terms of algorithmic complexity, though in some conditions it may be better on real hardware. For an actual ternary search, you need a 3-way comparison as an elementary operation.

Where it gets interesting is when you consider "radix efficiency" [1], for which the best choice is 3, the natural number closest to e. And it is relevant to tree search; that is, a ternary tree may be better than a binary tree.

[1] https://en.wikipedia.org/wiki/Optimal_radix_choice

eggprices · yesterday at 3:11 PM

When you can't seek quickly, e.g. on a disk, you can use a B-tree with say 128-way search. Fetching 128 keys doesn't cost much more than fetching 1 but it saves an additional 7 fetches.
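As a concrete (illustrative) instance of that trade-off, with a branching factor of 128 = 2^7, every node fetch resolves seven levels' worth of binary decisions:

```python
import math

n = 2**28                            # ~268M keys, chosen for round numbers
binary_fetches = math.log2(n)        # 28 levels if every probe is a disk fetch
btree_fetches = binary_fetches / 7   # each 128-way node covers log2(128) = 7 levels
print(binary_fetches, btree_fetches)
```

The comparison work inside a node is the same either way; what shrinks is the number of slow, serialized fetches.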

ack_complete · yesterday at 3:44 PM

Note that CPUs have also gotten dramatically wider in both execution width and vector capability since you were a teenager. The increased throughput shifts the balance more toward being able to burn operations to reduce dependency chains. It's possible for your idea to have been both non-viable on the CPUs at the time and more viable on CPUs now.

bryanlarsen · yesterday at 2:42 PM

Did you continue by fantasizing about CPUs that contain ternary comparators?

nkurz · yesterday at 2:51 PM

> Unfortunately although we cut the search space to 2/3 of what it was for binary search at each step (1/3 vs 1/2), we do 3/2 as many comparisons at each step (one comparison 50% of the time, two comparisons the other 50%), so it averages out to equivalence.

True, but is there some particular reason that you want to minimize the number of comparisons rather than have a faster run time? Daniel doesn't overly emphasize it, but as he mentions in the article: "The net result might generate a few more instructions but the number of instructions is likely not the limiting factor."

The main thing this article shows is that (at least sometimes, on some processors) a quad search is faster than a binary search _despite_ the fact that it performs theoretically unnecessary comparisons. While some computer scientists might scoff, I'd bet heavily that an optimized ternary search could also frequently outperform.
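A minimal sketch of such a search in Python (hypothetical, not the article's implementation): each step issues all three quartile-point comparisons unconditionally, so they carry no serial dependence on one another, at the cost of "wasted" work.

```python
def quad_search(a, x):
    """4-way search of sorted list a; returns index of x or None."""
    lo, hi = 0, len(a)               # half-open window a[lo:hi]
    while hi - lo > 3:
        q = (hi - lo) // 4
        m1, m2, m3 = lo + q, lo + 2*q, lo + 3*q
        # Three independent comparisons; some are redundant, but they
        # can be evaluated in parallel rather than one after another.
        k = (x >= a[m1]) + (x >= a[m2]) + (x >= a[m3])
        lo, hi = [(lo, m1), (m1, m2), (m2, m3), (m3, hi)][k]
    for i in range(lo, hi):          # tiny window: finish with a scan
        if a[i] == x:
            return i
    return None
```

On real hardware the win would come from branch-free selection and fewer dependent rounds, which this high-level sketch can only gesture at.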

compiler-guy · yesterday at 4:37 PM

This idea is reminiscent of the famous "Stooge sort", which recursively sorts the first two-thirds, the last two-thirds, and then the first two-thirds again. Naively, one might think clever fractional splits like these help, but of course they don't.

For years--maybe still?--analyzing its running time was a staple of the first or second problem set in a college-level "Introduction to Algorithms" course.

https://en.wikipedia.org/wiki/Stooge_sort

madcaptenor · yesterday at 2:50 PM

Isn't it a bit better on average, although not as much as you'd hoped? For example 19 steps of binary search get you down to 1/524288 of the original search space with 19 comparisons. 12 steps of ternary search get you down to 1/3^12 = 1/531441 of the original search space with, on average, 12 * 3/2 = 18 comparisons.
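The arithmetic here checks out under the 3/2 figure; a quick check (written for this note) also shows how the corrected 5/3 average from the sibling reply flips the conclusion:

```python
# 19 binary halvings vs 12 ternary splits over comparable search spaces
assert 2**19 == 524288 and 3**12 == 531441

print(19, 12 * 3 / 2)   # 19 vs 18.0 comparisons under the 3/2 estimate
print(19, 12 * 5 / 3)   # 19 vs 20.0 under the corrected 5/3 average
```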

bena · yesterday at 3:01 PM

Imagine if you split the search space N times, no middles. Then you could just compare the value.
