- Search motivation
- Linear search
- Complexity of linear search
- Binary search
- Complexity of binary search

Computer Science as a discipline is focused on two primary questions:

- What types of problems can be solved computationally?
- How
*efficiently*can these problems be solved?

One of the core problems we computers are asked to solved is the *search* problem. Broadly speaking, the search problem searches for a *query* item in a potentially large set of potential matches. For two large internet companies, search is one part of their core business model.

Searching efficiently can help you solve larger problems, help more customers, or make larger profits. So how do we organize data and write code/algorithms to search efficiently? And what do we even mean by efficient?

Consider the two programs `guess1.py`

and `guess2.py`

in your `w09-search`

folder. Each program plays a guessing game where the computer chooses a secret number between 1 and 100 and asks the user to guess the secret number. Both count the number of guesses until you are correct. But the programs differ slightly in how they give you feedback when your guess is wrong. Try playing both games. Is one easier than the other? Do you have a different strategy for one game than the other? Why?

To explore the complexity of search, we will narrow the problem to searching for an item `x`

in a python list of items. Python already has two ways of doing this. The first is the Boolean `in`

operator, `x in ls`

which returns `True`

if `x`

is in the list `ls`

and `False`

otherwise. Python also supports the `index()`

method which will tell you the position in the list of the first occurrence of `x`

, provided `x`

appears in the list. So `ls.index(x)`

will return an integer *position* if `x`

is in `ls`

. If `x`

is not in the list, Python will generate an error called an *exception* that will likely crash your program since we have not talked about how to handle exceptions in this class.

But how do these methods actually work? At some point, a computer scientist and python programmer designed and wrote the code for these built-in features. We will discuss the algorithm for searching a collection of items. Together, we'll write a function that does something familiar: search through a list for an item without using the `in`

operator. Our functions will be called `contains`

and `positionOf`

How do we analyze how long it takes for search to finish? We will count steps to analyze a programs complexity. What counts as a step? What happens in the best case? Worst case? How many steps does linear search take in the worst case?

A key algorithmic question is: can we do better? In somes cases, we can't (and we can prove this!). In the case of linear search, we cannot do better in the general case. However, if all items in the collection are in sorted order, we can perform a faster algorithm known as *binary search*.

Binary search works by using divide and conquer. Each step of the algorithm divides the number of items to search in half until it finds the value or has no items left to search. Here is some pseudocode for the algorithm:

```
set low = lowest-possible index
set high = highest possible index
LOOP:
calculate middle index = (low + high)/2
if item is at middle index, we're done (found it! return True)
elif item is < middle item,
set high to middle - 1
elif item is > middle item,
set low to middle + 1
if low is ever greater than high, item not here (done, return False)
```