2.13 Lower Bounds for Sorting

By now you have seen many analyses for algorithms. These analyses generally define the upper and lower bounds for algorithms in their worst and average cases. For many of the algorithms presented so far, analysis has been easy. This module considers a more difficult task: An analysis for the cost of a problem as opposed to an algorithm. The upper bound for a problem can be defined as the asymptotic cost of the fastest known algorithm. The lower bound defines the best possible cost for any algorithm that solves the problem, including algorithms not yet invented. Once the upper and lower bounds for the problem meet, we know that no future algorithm can possibly be (asymptotically) more efficient.

A simple estimate for a problem’s lower bound can be obtained by measuring the size of the input that must be read and the output that must be written. Certainly no algorithm can be more efficient than the problem’s I/O time. From this we see that the sorting problem cannot be solved by any algorithm in less than Ω(n) time because it takes at least n steps to read and write the n values to be sorted. Alternatively, any sorting algorithm must at least look at every input value to recognize whether the input values are in sorted order. So, based on our current knowledge of sorting algorithms and the size of the input, we know that the problem of sorting is bounded by Ω(n) and O(n log n).
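To make the “must look at every value” intuition concrete, here is a minimal Python sketch (the function name is my own, not from the text) that checks whether a list is in sorted order; any correct version of this check has to examine all n values, since skipping even one position could hide an out-of-order pair:

```python
def is_sorted(values):
    """Check whether a list is in nondecreasing order.

    Any correct check must examine every element: skipping even one
    position could hide an out-of-order pair, which is the intuition
    behind the Omega(n) input-size lower bound.
    """
    return all(values[i] <= values[i + 1] for i in range(len(values) - 1))

print(is_sorted([1, 2, 2, 5]))  # True
print(is_sorted([1, 3, 2]))     # False
```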

Computer scientists have spent much time devising efficient general-purpose sorting algorithms, but no one has ever found one that is faster than O(n log n) in the worst or average cases. Should we keep searching for a faster sorting algorithm? Or can we prove that there is no faster sorting algorithm by finding a tighter lower bound?

This section presents one of the most important and most useful proofs in computer science: No sorting algorithm based on key comparisons can possibly be faster than Ω(n log n) in the worst case. This proof is important for three reasons. First, knowing that widely used sorting algorithms are asymptotically optimal is reassuring. In particular, it means that you need not bang your head against the wall searching for an O(n) sorting algorithm. (Or at least not one that is in any way based on key comparisons. But it is hard to imagine how to sort without any comparisons.) Second, this proof is one of the few non-trivial lower-bounds proofs that we have for any problem; that is, this proof provides one of the relatively few instances where our lower bound is tighter than simply measuring the size of the input and output. As such, it provides a useful model for proving lower bounds on other problems. Finally, knowing a lower bound for sorting gives us a lower bound in turn for other problems whose solution could be made to work as the basis for a sorting algorithm. The process of deriving asymptotic bounds for one problem from the asymptotic bounds of another is called a reduction.

All of the sorting algorithms we have studied make decisions based on the direct comparison of two key values. For example, Insertion Sort sequentially compares the value being inserted against the values already in the sorted list, until a comparison shows that the value has reached its proper position.
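As an illustrative sketch (not the textbook’s code), here is an Insertion Sort instrumented to count key comparisons, making each binary decision explicit:

```python
def insertion_sort(a):
    """Insertion Sort that counts key comparisons.

    Each comparison is one binary decision: either shift the value
    left, or stop because it has reached its proper position.
    Returns the sorted list and the number of comparisons performed.
    """
    a = list(a)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comparisons += 1                   # one key comparison (decision)
            if a[j - 1] <= a[j]:
                break                          # value is in place; stop shifting
            a[j - 1], a[j] = a[j], a[j - 1]    # otherwise shift it left
            j -= 1
    return a, comparisons

print(insertion_sort([5, 4, 3, 2, 1]))  # ([1, 2, 3, 4, 5], 10)
```

On a reverse-sorted input of n values the count is n(n−1)/2, the worst case for this algorithm.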

The proof that any comparison sort requires Ω(n log n) comparisons in the worst case is structured as follows. First, comparison-based decisions can be modeled as the branches in a tree. This means that any sorting algorithm based on comparisons between records can be viewed as a binary tree whose nodes correspond to the comparisons, and whose branches correspond to the possible outcomes. Next, the minimum number of leaves in the resulting tree is shown to be n!, the factorial of n. Finally, the minimum depth of a tree with n! leaves is shown to be in Ω(n log n).
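The final step can be checked numerically. Assuming Python, this sketch computes ⌈log₂(n!)⌉, the minimum depth of a binary tree with n! leaves, for a few values of n; the values grow in proportion to n log n:

```python
import math

def min_decision_tree_depth(n):
    """Minimum depth of a binary tree with n! leaves: ceil(log2(n!)).

    Any comparison sort must perform at least this many comparisons
    on some input of size n.
    """
    return math.ceil(math.log2(math.factorial(n)))

for n in (3, 4, 8, 16):
    print(n, min_decision_tree_depth(n))
```

For n = 3 this gives 3, matching the fact that three comparisons suffice (and are sometimes necessary) to sort three keys.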

Before presenting the proof of an Ω(n log n) lower bound for sorting, we first must define the concept of a decision tree. A decision tree is a binary tree that can model the processing for any algorithm that makes binary decisions. Each (binary) decision is represented by a branch in the tree. For the purpose of modeling sorting algorithms, we count all comparisons of key values as decisions. If two keys are compared and the first is less than the second, then this is modeled as a left branch in the decision tree. In the case where the first value is greater than the second, the algorithm takes the right branch.
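For example, sorting three keys can be written so that every `if` corresponds to a node of the decision tree. This sketch (one of several valid trees, not code from the text) has exactly six leaves, one for each of the 3! = 6 possible orderings of the inputs:

```python
def sort3(a, b, c):
    """Sort three keys with explicit comparisons.

    Each `if` is one internal node of the decision tree; each of the
    six return statements is one leaf, labeling one permutation.
    """
    if a < b:
        if b < c:
            return (a, b, c)      # a < b < c
        elif a < c:
            return (a, c, b)      # a < c <= b
        else:
            return (c, a, b)      # c <= a < b
    else:
        if a < c:
            return (b, a, c)      # b <= a < c
        elif b < c:
            return (b, c, a)      # b < c <= a
        else:
            return (c, b, a)      # c <= b <= a

print(sort3(3, 1, 2))  # (1, 2, 3)
```

The deepest leaves lie at depth 3, so this tree performs at most three comparisons, matching ⌈log₂ 3!⌉ = 3.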

A visualization accompanying this section illustrates decision trees and the sorting lower bound proof.

Any sorting algorithm requiring Ω(n log n) comparisons in the worst case requires Ω(n log n) running time in the worst case. Because every comparison-based sorting algorithm requires Ω(n log n) running time, the problem of sorting (within the comparison model) also requires Ω(n log n) time. We already know of sorting algorithms with O(n log n) running time, so we can conclude that the problem of sorting requires Θ(n log n) time. As a corollary, we know that no comparison-based sorting algorithm can improve on existing Θ(n log n) time sorting algorithms by more than a constant factor.

2.13.1 Review questions: Lower bounds for sorting

Here are some review questions to check that you understand this proof.

Answer TRUE or FALSE.

The proof that the lower bound for the sorting problem is Ω(n log n) technically only applies to comparison-based sorting. This means that we can find other approaches (such as Radix Sort) to solve the problem faster.

  • Does Radix Sort compare?
  • While Radix Sort does not directly compare the keys of two records against each other, it does do a comparison for each digit of each key.
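For reference, a minimal LSD Radix Sort sketch (my own, assuming nonnegative integer keys) shows that records are never ordered by comparing one key against another; they are distributed into buckets digit by digit instead:

```python
def radix_sort(values, base=10):
    """LSD Radix Sort for nonnegative integers.

    No key is ever compared against another key to decide their
    relative order; each pass distributes records into buckets by
    one digit, least significant digit first.
    """
    if not values:
        return []
    max_val = max(values)      # comparisons here only size the loop,
                               # they never order two records
    digits = 1
    while base ** digits <= max_val:
        digits += 1
    for d in range(digits):
        buckets = [[] for _ in range(base)]
        for v in values:
            buckets[(v // base ** d) % base].append(v)
        values = [v for bucket in buckets for v in bucket]
    return values

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

Its cost is Θ(d·(n + base)) for d-digit keys, which escapes the Ω(n log n) bound precisely because it steps outside the comparison model.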

The upper bound for a problem is defined to be:

  • For a given problem, it is possible to write an algorithm that is as bad as we want. So defining a bound in terms of bad algorithms is not useful.
  • We look at the worst-case input to determine the worst-case upper bound for an algorithm, not a problem.
  • To get the upper bound for a problem, we compare the cost of each algorithm that we know.
  • The upper bound of the problem is the upper bound of the best known algorithm.

The lower bound for a problem is defined to be:

  • Upper and lower bounds are used to describe what we know, and we do not always know that these are the same for a given problem.
  • Lower bound refers to the class of inputs that we are considering (best, average, or worst-case). So it does NOT necessarily have anything to do with the best-case input.
  • The cost of the best algorithm that we know sets a limit on the upper bound of the problem, not the lower bound.
  • The lower bound refers to the best possible cost (for the class of inputs, such as worst-case) for ANY algorithm that solves the problem.

If the upper and lower bounds for a problem meet then:

  • The universe hasn’t exploded yet.
  • Just because the bounds meet, it does not mean that the cost is Θ(n).
  • We might need to implement an algorithm even if we are not certain that it is the best.
  • When the bounds meet, then we know for certain if an algorithm to solve that problem is best (within a constant factor).

Which of the following is NOT relevant to the sorting problem lower bounds proof?

  • The proof uses decision trees to model sorting algorithms.
  • The number of permutations affects the size of the decision tree.
  • Given a tree of a certain size, we can compute its minimum depth.
  • The cost of Bubble Sort does not affect the cost of other sorts.

log n! is:

  • Since n! ≤ n^n, it follows that log n! ≤ log n^n = n log n. So that eliminates anything bigger than n log n.
  • There are n terms in n!, and you need to take the log of each of them. Since they have some size, it has to be much more than log n.
  • It turns out to be worse than just n.
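Both directions of this bound can be made precise. The n! ≤ n^n argument gives the upper direction; for the lower direction, a standard argument keeps only the larger half of the factors of n!:

```latex
\log n! \;=\; \sum_{i=1}^{n} \log i
        \;\ge\; \sum_{i=\lceil n/2 \rceil}^{n} \log i
        \;\ge\; \frac{n}{2} \log \frac{n}{2}
        \;\in\; \Omega(n \log n).
```

Combined with log n! ≤ n log n, this shows that log n! ∈ Θ(n log n).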

A decision tree is:

  • A decision tree is not a search tree.
  • A decision tree is a model for behavior, not an algorithm.