WEEK08: searching, analysis of algorithms
---------------------------------------------------------------
W: binary search

ANNOUNCEMENTS:
- Lab 8 is up and due next Tue

BINARY SEARCH:
- last time we talked about linearSearch (and wrote searches.py example)
- binarySearch is like the guessing game (guess a # from 1 to 100) but
with info on each incorrect guess (too high or too low)

- list of items must already be in order/sorted!!

- let's try an example from the computer's point of view
(ie, can't "see" all items in L at once):

we have a variable, x
we have a list of items, L, that are "in order"

algorithm:
find the middle item, compare it to x
if x == middle item in L, return middle index
elif x is < middle item in L, refocus on left half of list
elif x is > middle item in L, refocus on right half of list
repeat the above until we find x in L or we don't...

How do we find the middle item in L???

low_index = 0
high_index = len(L) - 1
middle_index = (low_index + high_index)/2

EXAMPLE:

x = 7
L = [-20, -12, -4, 1, 7, 44, 45, 46, 58, 99, 145]

index: 0    1   2  3  4   5   6  7    8   9   10

low                 mid                 high

- mid = (low + high)/2 = (0 + 10)/2 = 5

- compare x = 7 to L[5] = 44
- since 7 is less than 44, we set high = mid - 1 = 4

low = 0
high = 4

(now we focus on the list from index 0 to index 4)

- recalculate mid = (low + high)/2 = (0 + 4)/2 = 2

- compare x = 7 to L[2] = -4
- since 7 is greater than -4, set low to mid + 1

low = 3
high = 4

(now we focus on the list from index 3 to index 4)

- recalculate mid = (low + high)/2 = (3 + 4)/2 = 3
- compare x = 7 to L[3] = 1
- since 7 is greater than 1, set low to mid + 1 (again)

low = 4
high = 4

(now we focus on the list from index 4 to index 4)

- recalculate mid = (low + high)/2 = (4 + 4)/2 = 4
- compare x = 7 to L[4] = 7
- FOUND X in L, so return mid index (4)

WORKSHEET:

- copy this worksheet, which includes pseudo-code, and do the searches:

/home/jk/inclass/searchWorksheet

(Note: assume you are the computer, doing a binarySearch for x in the list L)

NOTE: comparisons work with numbers and strings!

>>> "jeff" > "frances"
True
>>> "jeff" > "Jeff"
True
>>> "jeff" > "jeffrey"
False
>>> "apple" < "zebra"
True

- code up the binarySearch(x, L) algorithm in your searches.py file
and TEST it

ANALYSIS of BINARY SEARCH:

How many "steps" does each algorithm take, if len(L) = N??

best case          worst case
---------          ----------
linear search:      1                   N
binary search:      1                   ???

For the binary search, each time through the while LOOP we split
the list in half. If N = 1000000, how many times can we split
one million in half??

>>> N = 1000000
>>> while N > 1:
...   print N
...   N = N/2
...
1000000
500000
250000
125000
62500
31250
15625
7812
3906
1953
976
488
244
122
61
30
15
7
3

If you count that up, it's about 19 times.

Here's the mathy way to say that: log-base-2 of N

>>> from math import *
>>> log(1000000, 2)
19.931568569324174

If I double the amount of data we are searching through (N = 2000000),
now how many steps does binary search require (in the worst case)???

>>> log(2000000, 2)
20.931568569324174

JUST 1 more step than the N=1000000 case!!!

How many "steps" does each algorithm take, if len(L) = N??

best case          worst case
---------          ----------
linear search:      1                   N
binary search:      1                   log(N)