4.3 Implementing Mergesort
Implementing Mergesort presents some technical difficulties. The pseudocode in the last section is quite vague and we have to figure out how to make it work for arrays.
First, splitting an input array into two subarrays is easy. We don’t
even have to copy any elements, but we can use the same idea as for
binary search: use left
and right
indices to
refer to a subarray. To split this subarray into two halves, we just
calculate the middle index, mid
=
(left+right)/2
.
The main function for sorting a subarray can now be written like this:
// Sort the subarray arr[left .. right]
function mergeSort(arr, left, right):
if left >= right: // Base case: Subarray length is ≤ 1
return
= int((left + right) / 2) // The midpoint is where the second half starts
mid -1) // Mergesort first half
mergeSort(arr, left, mid// Mergesort second half
mergeSort(arr, mid, right) // Merge the two sorted halves merge(arr, left, mid, right)
The initial call would be mergeSort(arr,0,arr.size-1)
,
which sorts the whole array arr
.
The difficulty comes when we want to merge the two sorted subarrays. This cannot be done in-place (or rather, it is very very complicated to do it in-place). So we need an auxiliary array which we can merge the elements into.
// Merge the sorted subarrays arr[left .. mid-1] and arr[mid .. right]
function merge(arr, left, mid, right):
= new Array
temp // Merge the two sublists into the auxiliary array
= left
i1 = mid
i2 for i in left .. right:
if i2 > right: // Right sublist exhausted
= arr[i1]
temp[i] = i1 + 1
i1 else if i1 > mid: // Left sublist exhausted
= arr[i2]
temp[i] = i2 + 1
i2 else if arr[i1] <= arr[i2]: // Get smaller value
= arr[i1]
temp[i] = i1 + 1
i1 else:
= arr[i2]
temp[i] = i2 + 1
i2 // Copy the elements back from the auxiliary array
for i in left .. right:
= temp[i] arr[i]
Here is a visualisation for the merge step.
Notice that the merge function will create a new auxiliary array
every time it is called. This is quite inefficient, because it takes
some time to allocate memory for a new array, which can be destroyed
directly when merge is finished. One simple optimisation is to create
one single auxiliary array before the first call, and reuse this array
in all invocations of merge. The only thing we would have to do is to
add an extra argument to mergeSort
and merge
,
for the reference to the auxiliary array. Then we can create a wrapper
function that takes care of the initialisation, and makes the first
recursive call:
function mergeSort(arr):
= new Array(arr.size)
temp 0, arr.size-1) mergeSort(arr, temp,