WEEK08: searching, analysis of algorithms --------------------------------------------------------------- F: timing, real-world example...anagrams PYLAB: - copy over pylabLinePlots.py and try it out - http://matplotlib.sourceforge.net (good data plotting from python) - note how logn is better than linear for large N (we'll look at the nlogn and quad lines next week) SHUFFLE and SORT: >>> from random import * >>> L = range(10) >>> shuffle(L) >>> print L [4, 2, 1, 5, 8, 7, 0, 3, 6, 9] >>> >>> L.sort() >>> print L [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] sort() is a list method, shuffle is a function from the random library. Both do their work "in place", meaning the list (L) is changed. TIMING SEARCHES: - write some code to show how the time for a linear search doubles when we double the size of the data (N) - also, how does the "in" operator work? Does it do a linear search or a binary search?? from searches import * from random import * from time import time ############################################## def main(): """ask user for size of data, run searches""" N = input("size of data, N: ") T = input(" num trials, T: ") L = [] for i in range(N): L.append(randrange(-N,N)) L.sort() lintotal = 0 bintotal = 0 intotal = 0 for t in range(T): x = randrange(-N,N) t1 = time() linearSearch(x, L) t2 = time() lintotal = lintotal + (t2 - t1) t1 = time() binarySearch(x, L) t2 = time() bintotal = bintotal + (t2 - t1) t1 = time() x in L t2 = time() intotal = intotal + (t2 - t1) print "ave linear time: %0.7f" % (lintotal/float(T)) print "ave binary time: %0.7f" % (bintotal/float(T)) print "ave in time: %0.7f" % (intotal/float(T)) ############################################## main() $ python timesearches.py size of data, N: 200000 num trials, T: 20 ave linear time: 0.0186138 ave binary time: 0.0000101 ave in time: 0.0085083 $ python timesearches.py size of data, N: 400000 num trials, T: 20 ave linear time: 0.0469730 ave binary time: 0.0000128 <--- does not double! ave in time: 0.0235038 ANAGRAMS: - copy /home/jk/inclass/anagrams.py and run the code: $ python anagrams.py -- Welcome to A N A G R A M -- Enter a word and I'll find the anagrams (QUIT to quit): miles -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- miles limes smile slime -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- - look at the code to see how it works. You should be able to understand all but the Anagram function, which uses recursion (we'll cover that next week) - the Anagram function just creates a list of all possible permutations of the given word. If you type in cat, Anagram creates the following list: ['cat', 'act', 'atc', 'cta', 'tca', 'tac'] - this list of all permutations is sent to displayAnagram which only prints out the real English words. - what happens when I type in an 8-letter word??? - why does it run so slow??? - for a 3-letter word there are 6 permutations. How many are there for a 4-letter word? How about a 5-letter word?? YOUR TURN: - change the displayAnagram function to use a binary search, instead of this linear search: if w in english and w not in uniqalist: make that something like this: if binarySearch(w,english) and w not in uniqalist: so your binarySearch function needs to return a True if the word is found, and a False if not found. RESULTS: - how many anagrams does the word "education" have? - how long does it take to figure that out with binarysearchanagrams.py vs. anagrams.py??