Hacker News

qsort · yesterday at 5:09 PM

Only if you assume a finite alphabet and bounded length. Relax either and you're back to O(n log n) for a fully general solution. Examples of both: tuples and strings.

(There's also the problem of how you define your computational model. You can do better than O(n log n) in transdichotomous models. I'm assuming the hand-wavy, naive model the average algorithms class goes along with.)


Replies

SkiFire13 · yesterday at 5:55 PM

> Only if you assume a finite alphabet and bounded length

You can generally reduce the problem to a finite alphabet by taking the finite subset that actually appears in the input.

If you have an unbounded length then you can make sorting O(l n), where `l` is a bound on the lengths of your input. That's still linear in n, and also better than the O(l n log n) you would get with traditional comparison-based algorithms once you factor in the O(l) cost of comparing such elements.
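Concretely, a sketch of that O(l n) approach (not from the thread, just an illustration): LSD radix sort on equal-length strings, where each pass only buckets on characters that actually appear in the input, which is the "take the finite subset" reduction.

```python
from collections import defaultdict

def lsd_radix_sort(strings, length):
    """Sort equal-length strings in O(l * n) with l stable bucket passes,
    one per character position, from last position to first."""
    for pos in range(length - 1, -1, -1):
        buckets = defaultdict(list)
        for s in strings:
            buckets[s[pos]].append(s)  # stable: preserves order within a bucket
        # Only the characters that actually occur in the input become keys,
        # so the effective alphabet is the finite subset that appears.
        strings = [s for key in sorted(buckets) for s in buckets[key]]
    return strings
```

The `sorted(buckets)` step orders the observed alphabet; if that alphabet is small (or treated as a constant), each pass is O(n) and the whole sort is O(l n).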

shiandow · yesterday at 7:57 PM

Both assumptions hold in practice for graph weights (not for sorting in general), so they're not unreasonable.

That said the finite alphabet and bounded length requirements can be softened a bit. Even for general sorting algorithms.

I mean, for the kind of lexicographically sortable data we're talking about, you can basically pick a convenient alphabet size at no cost.

And unbounded length is not that big an obstruction. Sure you are going to need O(n log(n)) comparisons. But you can't compare data of unbounded length in constant time anyway. In the end you end up taking an amount of time that is at least proportional to the amount of data, which is optimal up to a constant factor. And if you fiddle with radix sort enough you can get it within something similar.
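One way to "fiddle with radix sort" for unbounded lengths (a hypothetical sketch, not something from the thread) is MSD radix sort: recurse on one character position at a time, so each string is only examined up to the prefix needed to distinguish it, and the total work is roughly proportional to the amount of data actually read.

```python
def msd_sort(strings, d=0):
    """MSD radix sort on variable-length strings. Each call touches only
    character position d, so total work is proportional to the number of
    characters examined (the distinguishing prefixes), not n * max_length."""
    if len(strings) <= 1:
        return strings
    # Strings that end at position d sort before all longer ones.
    done = [s for s in strings if len(s) == d]
    buckets = {}
    for s in strings:
        if len(s) > d:
            buckets.setdefault(s[d], []).append(s)
    out = done
    for key in sorted(buckets):  # ordered over the observed alphabet only
        out.extend(msd_sort(buckets[key], d + 1))
    return out
```

This is the sense in which radix sort gets "within something similar" to the data-proportional lower bound: a comparison sort also has to read those same distinguishing prefixes to order the strings.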

Basic ASCII strings and tuples aren't that big an obstruction. Fractions are more complicated.

Really, the O(n log(n)) for comparison-based sorts and the O(n) for radix sort measure different things. One counts comparisons as a function of the number of elements; the other is closer to the number of operations per amount of data. Though that assumes O(1) swaps, which is technically incorrect for data that doesn't fit in a 64-bit word.

ogogmad · yesterday at 5:15 PM

In the most extreme case of this, you can use Counting Sort. Tangential to this, Spaghetti Sort makes me wonder about which parts of CS (especially the data structures, like arrays) are objective or just accidental.
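For reference, counting sort in the extreme case of small integer keys is a few lines (a standard sketch, assuming keys in a known range [0, k)):

```python
def counting_sort(xs, k):
    """Sort integers in the range [0, k) in O(n + k) time by tallying
    occurrences of each value, then emitting values in order."""
    counts = [0] * k
    for x in xs:
        counts[x] += 1
    out = []
    for value, count in enumerate(counts):
        out.extend([value] * count)
    return out
```

No comparisons at all; the O(n log n) lower bound simply doesn't apply because the keys index directly into an array.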

The transdichotomous model looks interesting.
