WEEK10: recursion --------------------------------------------------------------- F: merge sort
MERGE SORT:
- merge sort uses a divide-and-conquer method to sort a list:
given a list, split it in two (if the size of the list > 1)
sort each of the two halves of the list
merge the two sorted halves back into the original list
- this is easy to express using recursion:
def mergeSort(L):
    n = len(L)
    if n > 1:
        L1 = L[:n//2]       # first half (// for integer division)
        L2 = L[n//2:]       # second half
        mergeSort(L1)       # recursively sort each half
        mergeSort(L2)
        merge(L1, L2, L)    # merge the sorted halves back into L
- that last step assumes we have a function called merge that
merges the two sorted lists back into one big sorted list
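One possible sketch of that merge function (this is an assumption about how merge could be written, not necessarily the version in class): walk through both sorted halves, always copying the smaller front element back into L.

```python
def merge(L1, L2, L):
    # Merge sorted lists L1 and L2 back into L.
    # Assumes len(L) == len(L1) + len(L2).
    i = j = k = 0
    while i < len(L1) and j < len(L2):
        if L1[i] <= L2[j]:
            L[k] = L1[i]
            i += 1
        else:
            L[k] = L2[j]
            j += 1
        k += 1
    # one of the halves may still have leftovers; copy them over
    while i < len(L1):
        L[k] = L1[i]
        i += 1
        k += 1
    while j < len(L2):
        L[k] = L2[j]
        j += 1
        k += 1
```

Note that each call copies every element back into L exactly once, which is the "N copies per level" fact used below.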
YOUR TURN:
- copy /home/jk/inclass/sorts.py and add the mergeSort function
- run sorts.py and see how long mergeSort takes for N=5000,10000,20000
(you may want to comment out selectionSort and insertionSort
for the N=20000 run)
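If you don't have access to sorts.py, here is a minimal timing sketch (an assumption about its general shape, not the actual file): build a random list of size N and time one sort function on it.

```python
import random
import time

def timeSort(sortfn, N):
    # Build a random list of N values, then time sortfn on it.
    L = [random.randrange(N) for _ in range(N)]
    start = time.perf_counter()
    sortfn(L)
    return time.perf_counter() - start

# example usage: time the built-in list.sort for N = 5000
# print(timeSort(lambda L: L.sort(), 5000))
```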
Here's what I get:
N            5000      10000      20000
L.sort     0.0053     0.0114     0.0102
selection  7.1494    14.5595    51.2008
insertion 11.7596    19.9309    77.8990
merge      0.1288     0.1150     0.2420
So merge sort is much better than selection or insertion, but
not as good as the default .sort() method.
- how does mergeSort's time depend on N, the size of the list?
- how many "steps" does mergeSort take to finish its work?
Each level of recursion requires copying all N values from the
halved sub-lists back to the bigger original list. So that's
N copies * how many levels of recursion. Each level of recursion
requires splitting a given list into two smaller lists. So "how many
levels of recursion" is really just "how many times can we split
the list in two?", which we already know is log(base2)N.
So mergeSort's time is proportional to N * logN, which is much
better than the N * N time of selection and insertion sort.
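A quick sanity check of the "how many levels of recursion" argument: count how many times a list of size N can be halved before reaching size 1, and compare against log(base2)N.

```python
import math

def levels(n):
    # Count how many times n can be halved before reaching 1.
    count = 0
    while n > 1:
        n = n // 2
        count += 1
    return count

print(levels(8), math.log2(8))        # 3 3.0
print(levels(1024), math.log2(1024))  # 10 10.0
```

For N = 20000 that's about 15 levels of N copies each, versus 20000 * 20000 steps for the quadratic sorts, which matches the timing table above.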
See /home/jk/inclass/lineplots.py for a graphical comparison
of NlogN vs logN vs linear vs quadratic.