Class Notes Week 9

Week 9 Topics

Michigan making it to the Final Four
Nested for loops
Linear search
Complexity of linear search
Big-O notation
Binary search
Complexity of binary search

A-Maize-ing

Nested for loops

We have seen that loops can contain other code blocks, including if-statements and other for loops. A for loop inside of a for loop is called a nested for loop. Here is a simple example, that prints out a times table for integers up to 5:

for i in range(1,6):
  for j in range(1,6):
    print(i,j,i*j)

which produces the following (first two columns are the numbers being multiplied and the last column is the result):

Exercise

Modify the nested for loop to omit repeat calculations (e.g., only do one of 1*5 and 5*1). Here is one version of the output:

These can be useful in many circumstances, but in particular for navigating through list-of-lists. For example, we may want to average all quiz scores in our gradebook example last week, or find all scores below 60%.

Exercise: sumLoL.py

Open up sumLoL.py:

$ cd ~/cs21/inclass/w09-search
$ atom sumLol.py

Complete the function such that it will sum the entire contents of a list of lists.

Linear Search

We have used Python functions for searching (e.g., in operator, list.index()). How do these methods actually work? We will discuss the algorithm for searching a collection of items. Together, we’ll write a function that does something familiar: search through a list for an item without using the in operator.

Complexity of Search

How do we analyze how long it takes for search to finish? We will count steps to analyze a programs complexity. What counts as a step? What happens in the best case? Worst case? How many steps does linear search take in the worst case?

Binary Search

A key algorithmic question is: can we do better? In somes cases, we can’t (and we can prove this!). In the case of linear search, we cannot do better in the general case. However, if all items in the collection are in sorted order, we can perform a faster algorithm known as binary search.

Binary search works by using divide and conquer. Each step of the algorithm divides the number of items to search in half until it finds the value or has no items left to search. Here is some pseudocode for the algorithm:

set low = lowest-possible index
set high = highest possible index
LOOP:
    calculate middle index = (low + high)/2
    if item is at middle index, we're done (found it! return True)
    elif item is < middle item,
      set high to middle - 1
    elif item is > middle item,
      set low to middle + 1
if low is ever greater than high, item not here (done, return False)

In testsearches.py, we will implement the binary search algorithm to search a list for some random value.

Exercise: Binary Search Worksheet

Using the handed out worksheet, work through binary search on a variety of search problems. Keep track of the values of low, high, and mid along the way and note the value that is returned. If you do not receive a handout, here are the search problems:

x = 99
L = [-20, -12, -4, 1, 7, 44, 45, 46, 58, 67, 99, 145]
index: 0    1   2  3  4   5   6   7   8   9  10   11

x = -10
L = [-20, -12, -4, 1, 7, 44, 45, 46, 58, 67, 99, 145]     
index: 0    1   2  3  4   5   6   7   8   9  10   11

x = "amanda"
L = ['andy','betsy','bridget','charles','charlie','jeff','lisa','rich','tia']
index: 0       1        2         3         4        5      6      7     8

x = "rich"
L = ['andy','betsy','bridget','charles','charlie','jeff','lisa','rich','tia']
index: 0       1        2         3         4        5      6      7     8

Analysis of Binary Search

Why is it called binary search? How many steps does it take for binary search to run? We will see that binary search is O(logN). What does this mean relative to linear search? Run testsearches.py with different values of N to see how the run time of binary search grows with increasing list sizes. How does this compare to linear search’s growth?