2.12 An Empirical Comparison of Sorting Algorithms

Which sorting algorithm is fastest? Asymptotic complexity analysis lets us distinguish between Θ(n2)\Theta(n^2) and Θ(nlogn)\Theta(n \log n) algorithms, but it does not help distinguish between algorithms with the same asymptotic complexity. Nor does asymptotic analysis say anything about which algorithm is best for sorting small lists. For answers to these questions, we can turn to empirical testing.

Table: Empirical comparison

Empirical comparison of sorting algorithms run on an Apple MacBook M3 Pro (2023 model). All times shown are seconds.

The first table shows Python implementations, using CPython version 3.11:

Sort 1K 10K 100K 1M 10M 100K sorted 100K reversed
Insertion sort 0.009 0.88 82 0.003 175
Bubble sort 0.020 2.08 211 97 280
Selection sort 0.006 0.68 68 66 66
Merge sort 0.001 0.009 0.105 1.26 17.4 0.084 0.087
Merge (cutoff 20) 0.000 0.006 0.077 1.00 15.1 0.048 0.071
Quicksort 0.000 0.006 0.074 0.95 12.6 0.045 0.047
Quick (cutoff 20) 0.000 0.005 0.066 0.87 11.8 0.030 0.032

The second table shows Java implementations, using OpenJDK version 22.0 (note that the arrays are 4 times larger than for the Python tests):

Sort 4K 40K 400K 4M 40M 400K sorted 400K reversed
Insertion sort 0.015 0.62 69 0.01 126
Bubble sort 0.018 1.46 312 35 175
Selection sort 0.023 0.96 112 41 79
Merge sort 0.001 0.010 0.14 1.06 13.2 0.11 0.12
Merge (cutoff 20) 0.001 0.009 0.12 1.03 12.3 0.10 0.10
Quicksort 0.001 0.008 0.06 0.66 8.2 0.04 0.03
Quick (cutoff 20) 0.001 0.007 0.06 0.57 8.0 0.04 0.04

The table above shows timing results for actual implementations of the sorting algorithms presented in this chapter. The algorithms compared include Insertion Sort, Bubble Sort, Selection Sort, Quicksort, and Mergesort.

For Quicksort and Mergesort, two versions are compared: the basic implementation, and an optimized version that falls back to insertion sort for sublists of length below 20.

Except for the rightmost columns, the input to each algorithm is a random array of integers. This affects the timing for some of the sorting algorithms. For example, Selection Sort is not being used to best advantage because the record size is small, so it does not get the best possible showing.

The various sorting algorithms are shown for arrays of sizes being multiples of 10. The final two columns of each table show the performance for the algorithms on inputs of size 100,000 (400,000 for the Java tests) where the numbers are in ascending (sorted) and descending (reverse sorted) order, respectively. These columns demonstrate best-case performance for some algorithms and worst-case performance for others. They also show that for some algorithms, the order of input has little effect.

These figures show a number of interesting results. As expected, the O(n2)O(n^2) sorts are quite poor performers for large arrays. Insertion Sort is by far the best of this group, unless the array is already reverse sorted. Optimized Quicksort is clearly the best overall algorithm for all but lists of 10 records. Even for small arrays, optimized Quicksort performs well because it does one partition step before calling Insertion Sort. In general, optimizing the various algorithms makes a noticeable improvement for larger array sizes.

2.12.1 Review questions: Comparison of sorting algorithms

Here are a few multiple choice questions that ask you to compare the sorting algorithms that we learned about in this chapter.

Answer TRUE or FALSE.

Selection Sort is generally faster than the Bubble Sort on the same input.

  • Selection Sort can be viewed as an optimization of Bubble Sort.
  • On each pass, Selection Sort just picks out the next record, while Bubble Sort has to do a lot of swapping.

How are Selection Sort and Quicksort similar?

  • Does Selection Sort use divide-and-conquer? No.
  • Can Selection Sort run in O(nlogn)O(n \log n) time? No.
  • How long does Quicksort need in the average case?
  • O(nlogn)O(n \log n)

(For the implementations as presented in this chapter:) Which of the following sorts is not stable?

  • [ ] Insertion Sort
  • [ ] Mergesort
  • Which of the mentioned algorithms will not maintain the relative order of records with equal keys?

What do we call the property of a sorting algorithm that gaurantees that records with the same key value occur in the same order in the sorted list as in the original, unsorted list?

  • “Linear” refers to an algorithm cost.
  • “External” refers to an algorithm to sort records on disk.
  • “Consistent” is not standard terminology related to sorting.
  • “Stable” sorting algorithms preserve the relative order of equal-valued keys.

A person sorting a hand of cards might reasonably use which sorting algorithm?

  • Think about each algorithm in turn, and imagine yourself trying to use it for sorting a hand of cards.
  • If you pick up the first two cards you might put them in order, then pick up the next card, and so on.
  • Or you might pick up all the cards to start, but then still sort them from left to right.
  • What sorting algorithm resembles this procedure?

Consider what happens if someone accidently calls sort on a file that is already sorted. Which of the following sorting methods will be the most efficient if the input is already in sorted order?

  • Which algorithm will make an iteration “quit early” depending on the values seen during comparisons?

Which of the following sorting methods will be best if the number of swaps done is the only measure of efficiency?

  • This sorting algorithms finds the minimum value, swaps it with the value in the first position, and repeats these steps for the remainder of the list.
  • It does no more than nn swaps.

Which of the following sorting algorithms has a worst case complexity of Θ(nlogn)\Theta(n \log n)?

  • Bubble Sort, Insertion Sort, and Selection Sort are referred to as “quadratic sorts” because of their worst-case time cost.
  • Quicksort’s worst-case behaviour comes when the pivot is chosen badly.

When Mergssort merges two sorted lists of sizes mm and nn into a sorted list of size m+nm+n, it requires how many comparisons in the worst case?

  • Merging compares two records, and selects the smaller.
  • Each record gets selected once, and each selection requires a comparison.
  • However, the last record does not need a comparison to be selected because there is no longer anything to compare it to.