## Monday

This week and next week we are working on top-down design (TDD), a useful technique for writing larger, more complex programs. Today (Monday) we will learn about file input and output, so we can use data stored in files in our programs. On Wednesday we will start learning TDD.

### Files

Motivations and possible uses:

• video game data files (read in terrain data; keep track of high scores)

• grader program: store student grade data in a file (don’t have to type it in each time the program runs)

• iTunes: how does iTunes keep track of number of plays for each song??

Files can be plain text files, like the ones you edit with Atom.

#### syntax

The basic syntax for opening a file is:

`myfile = open(filename,mode)`

where `filename` is the name of a file, and `mode` is the mode used for opening: usually read ("r") or write ("w") mode. Both arguments are strings, and `myfile` is just the variable I picked to store the file object returned by the `open()` function.

Here is an example of opening a file called `poem.txt` for reading, and storing the file object in a variable called `infile`:

`infile = open("poem.txt", "r")`

#### examples

Once you have a file object, you can use the input and output methods on the object.

##### OUTPUT

Here’s how to open a file for writing (note: `myfile` is a variable name that I choose, and `"newfile"` is the name of the file to write to):

```
$ python3
>>> myfile = open("newfile", 'w')
>>> myfile.write("write this to the file \n")
>>> myfile.write("and this.... \n")
>>> myfile.close()
```

and here are the results of this:

```
$ ls
newfile
$ cat newfile
write this to the file
and this....
```

What happens if we leave out the `\n` on each line??

##### INPUT

I have a file called `words.txt` with a few words in it:

```
$ cat words.txt
happy
computer
lemon
zebra
```

To open a file for reading, use 'r' mode:

`>>> infile = open("words.txt", 'r')`

File `words.txt` must exist, otherwise we get an error.
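
As a rough sketch of what that error looks like: in Python 3, opening a missing file for reading raises `FileNotFoundError`. The file name `no-such-file.txt` and the helper `tryOpen` below are made up for illustration, and catching the error with `try`/`except` is only one possible way to react to it.

```python
def tryOpen(filename):
    """return the file object, or None if the file does not exist"""
    try:
        infile = open(filename, "r")
        return infile
    except FileNotFoundError:
        print("Sorry, can't find file: %s" % (filename))
        return None

# "no-such-file.txt" is a made-up name, assumed not to exist
infile = tryOpen("no-such-file.txt")
print(infile)
```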

The `infile` variable, which is a file object, can be used as a sequence (e.g., in a `for` loop!):

```
>>> for line in infile:
...    print(line)
...
happy

computer

lemon

zebra
```

We can use the `for` loop like above, or we could use the file methods: `readline()` to read one line, `readlines()` to read them all at once.

```
>>> # need to close and reopen to get back to start of file
>>> infile.close()
>>>
>>> infile = open("words.txt", "r")
>>> word = infile.readline()
>>> print(word)
happy

>>> word = infile.readline()
>>> print(word)
computer

>>> infile.close()
>>> infile = open("words.txt", "r")
>>> words = infile.readlines()
>>> print(words)
['happy\n', 'computer\n', 'lemon\n', 'zebra\n']
```

So `readlines()` reads in EVERYTHING and puts each line into a python list. NOTE: the newline characters are still part of each line! Sometimes you want to read in EVERYTHING, all at once. Sometimes it’s better to read data in line-by-line and process each line as you go (use the `for` loop: `for line in infile`)
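
To make the difference concrete, here is a small sketch that builds the same list both ways. It writes its own copy of the word list first so that it can run anywhere; the file name `words-demo.txt` and the variable names `allAtOnce` and `oneByOne` are made up for this example.

```python
# make a small file to read from (same contents as words.txt above)
outfile = open("words-demo.txt", "w")
outfile.write("happy\ncomputer\nlemon\nzebra\n")
outfile.close()

# all at once: readlines() returns a list, newlines included
infile = open("words-demo.txt", "r")
allAtOnce = infile.readlines()
infile.close()
print(allAtOnce)

# line by line: handle each line as it is read
oneByOne = []
infile = open("words-demo.txt", "r")
for line in infile:
    oneByOne.append(line)
infile.close()
print(oneByOne)
```

Both lists end up the same here. The line-by-line version is the place to do any per-line work (skip a line, convert it, and so on) before storing it.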

#### File I/O Notes:

• reading from and writing to files is usually S L O W

• for this reason, we usually read in data at the beginning of a program and store it in a list or other data structure (i.e., if we need the data throughout the program, it’s much faster to refer to the list rather than the file)

• also, reading from the file is similar to watching a movie on VHS tapes — at the end of the movie, you have to rewind the tape to get back to the beginning. Once we do that `for line in infile` loop above, we are at the end of the file. You can "rewind" the file by closing and reopening it (or use the `seek()` method in python)
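
Here is a minimal sketch of the rewind idea using `seek(0)`, which moves back to the start of the file. It writes its own small file first (the name `rewind-demo.txt` is made up) so that it can run on its own.

```python
# make a small file to read from
outfile = open("rewind-demo.txt", "w")
outfile.write("happy\ncomputer\n")
outfile.close()

infile = open("rewind-demo.txt", "r")
first = infile.readlines()    # reads everything, now at end of file
again = infile.readlines()    # nothing left to read: empty list
infile.seek(0)                # "rewind" to the beginning
third = infile.readlines()    # same lines as the first time
infile.close()

print(first)
print(again)
print(third)
```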

### str methods: strip() and split()

```
$ cat grades.txt
saul: 93
thibault: 92.5
lauri: 100
andy: 70
jeff: 67.5
kevin: 85
```

If I want to find the average of all of those grades, I need to read in each line, then somehow pull out the grade and store it. This is where the `str` methods `strip()` and `split()` can be used. Here are examples of each:

```
>>> infile = open("grades.txt", "r")
>>> line = infile.readline()
>>> print(line)
saul: 93

>>> data = line.split(":")
>>> print(data)
['saul', ' 93\n']
>>> grade = float(data[1])
>>> print(grade)
93.0
```

So `split()` just splits a string and returns the results as a list. In the above example we split the string on a colon (":"). Here are a few more examples of `split()`. By default (with no arguments given), it splits the string on whitespace.

```
>>> S = "a,b,c,d,e,f,g"
>>> L = S.split(",")
>>> print(L)
['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> phrase = "the quick brown fox jumped over the lazy dog"
>>> words = phrase.split()
>>> print(words)
['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']
```

And `strip()` will strip off leading and trailing characters. Again, by default it strips off whitespace. If you provide an argument, it will strip off that:

```
>>> S = " hello\n"
>>> print(S)
 hello

>>> print(S.strip())
hello
>>>
>>> word = "Hello!!!!!"
>>> print(word.strip("!"))
Hello
```

#### Your turn

Can you write a program to read the `grades.txt` file into a python list of grades, and then calculate the average grade? Here’s an example of what we want:

```
$ python3 grader.py
[93.0, 92.5, 100.0, 70.0, 67.5, 85.0]
```
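
As a possible starting point, here is a sketch of how `split()` and `float()` might be combined on each line. It works on a hard-coded list of two lines rather than the real file, and the helper name `getGrade` is made up, so reading `grades.txt` and computing the average are still left to you.

```python
def getGrade(line):
    """given a line like 'saul: 93', return the grade as a float"""
    data = line.split(":")      # e.g. ['saul', ' 93\n']
    return float(data[1])       # float() ignores the space and the newline

# two sample lines, in the same format as grades.txt
lines = ["saul: 93\n", "thibault: 92.5\n"]
grades = []
for line in lines:
    grades.append(getGrade(line))
print(grades)
```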

#### challenge

Once you have the grades in a list, can you find the highest and lowest grades?

```
$ python3 grader.py
[93.0, 92.5, 100.0, 70.0, 67.5, 85.0]
average grade: 84.7
highest grade: 100.0
lowest grade: 67.5
```

## Wednesday

### top-down design

As we write bigger and more complex programs, designing them first will save time in the long run. Similar to writing an outline for a paper, using top-down design means we write out `main()` first, using functions we assume will work (and that we will write, later). We also decide on any data structures we will need.

What we want to avoid is writing a whole bunch of functions and code, and then realizing we forgot something, which changes some of the functions, which changes other functions, and so on.

Furthermore, we want a way to test each function as we write it. Many first-time programmers will write lots of functions and then start testing them. If each function has a few bugs in it, this makes it a lot harder to debug.

A typical top-down design includes the following:

• `main()` all written out

• function stubs written (`def` with params, function comment, dummy `return` value)

• data structures clearly defined (store data in a list? an object? a list of objects? something else?)

• the design should run without any syntax errors

• running the design should show how the program will eventually work

Once you have the design all written out, you can now attack each function one-at-a-time. Get the function to work. Test it to make sure it works, then move on to the next function.

### TDD example

Suppose we want to write this square word game:

```
$ python3 squarewords.py
l|l|e
d|c|o
r|r|a

word 1? corralled
Correct! Score = 10

g|p|i
l|l|a
g|i|n

word 2?
Incorrect...word was: pillaging   Score = 0

d|i|n
g|a|c
c|o|r

word 3? according
Correct! Score = 10

s|n|i
d|a|o
o|t|n

word 4? quit
```

The game uses 9-letter words, and displays them in a 3x3 box, where the letters run either vertically or horizontally, and the start of the word can be anywhere in the 3x3 box. The user’s job is to figure out each 9-letter word.
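
One possible way to build such a grid is sketched below. The notes do not say how the real `display()` is written, so this is only an illustration: the helper `makeGrid` and its parameters `start` (the cell where the first letter goes, 0 to 8) and `vertical` (which way the letters run) are assumptions made up for this example.

```python
from random import randrange, choice

def makeGrid(word, start, vertical):
    """return list of 3 strings: the rows of the 3x3 grid for word"""
    # cells[k] is the k-th letter in reading order; the word's first
    # letter lands at position start and wraps around at the end
    cells = [""] * 9
    for i in range(9):
        cells[(start + i) % 9] = word[i]
    rows = []
    for r in range(3):
        if vertical:
            # reading order runs down each column
            row = [cells[r], cells[r + 3], cells[r + 6]]
        else:
            # reading order runs across each row
            row = cells[3 * r: 3 * r + 3]
        rows.append("|".join(row))
    return rows

def display(word):
    """display word in 3x3 grid with random start position"""
    start = randrange(9)               # where the first letter goes
    vertical = choice([True, False])   # which way the letters run
    for row in makeGrid(word, start, vertical):
        print(row)

display("corralled")
```

Keeping the random choices in `display()` and the grid-building in `makeGrid()` means the grid logic can be checked with fixed values. For instance, `makeGrid("corralled", 4, False)` gives the first grid in the sample run above.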

Here’s my TDD for the above program:

```
"""
squareword puzzle game as example of tdd

J. Knerr
Fall 2019
"""

from random import *

def main():
    # NOTE: the word-file path here is an assumption; any file with
    # one word per line would do
    words = read9("/usr/share/dict/words")
    score = 0
    wordnum = 0
    done = False
    while not done:
        word = words[wordnum]
        display(word)
        guess = getInput(wordnum)
        if guess == "quit":
            done = True
        elif guess == word:
            score += 10
            print("Correct! Score = %d" % (score))
        else:
            score -= 10
            print("Incorrect...word was: %s   Score = %d" % (word,score))
        wordnum += 1

def display(word):
    """display word in 3x3 grid with random start position"""
    print(word)

def getInput(n):
    """get user's guess, make sure it's valid, return valid guess"""
    guess = input("word %d? " % (n+1))
    # should allow 9-letter word, quit, and empty string
    return guess

def read9(filename):
    """read all 9-letter words from file, shuffle the order, return list"""
    words = ["aaaaaaaaa","bbbbbbbbb","ccccccccc"]
    return words

main()
```

Notice that `main()` is completely written! And the goal is that it shouldn’t need to change much, as I implement the remaining functions.

Also, the program runs, but doesn’t really do much yet (since I haven’t really written all of the functions). Here’s what it looks like so far:

```
$ python3 design-squarewords.py
aaaaaaaaa
word 1? 123
Incorrect...word was: aaaaaaaaa   Score = -10
bbbbbbbbb
word 2? bbbbbbbbb
Correct! Score = 0
ccccccccc
word 3? quit
```

Now, since I have a working program (it runs without syntax errors!), I can attack each function separately. I want to write a function and thoroughly test it before I move on to the next function.

#### write the `getInput(n)` function

Can you write the `getInput(n)` function? It should keep asking the user for input until it gets a valid string: either a 9-letter word, "quit", or the empty string.

```
a|r|y
m|a|x
i|l|l

word 1? hello
Please enter a 9-letter word!
word 1? 123456789
Please enter a 9-letter word!
word 1? wwww eeee
Please enter a 9-letter word!
word 1? abcdefghi
Incorrect...word was: maxillary   Score = -10
```

## Friday

### review of squarewords functions

Here’s one way to write the `getInput(n)` function from the `squarewords.py` file:

```
def getInput(n):
    """get user's guess, make sure it's valid, return valid guess"""
    while True:
        guess = input("word %d? " % (n+1))
        # should allow 9-letter word, quit, and empty string
        if guess == "" or guess == "quit":
            return guess
        elif len(guess)==9 and guess.isalpha()==True:
            return guess
        else:
            print("Please enter a 9 letter word!!")
```

The `while True` is just an infinite loop. I use it here to get the loop going, and not have to worry about a specific condition. Because it’s an infinite loop, I need to make sure there’s a way out. That’s what the `return guess` lines do: return the user’s guess back to `main()`, so both the `while` loop and the function are done.

The only way out of this infinite loop is if we get valid input from the user. Otherwise we print the error message ("Please enter a 9 letter word!!") and loop back to the `input()` call at the top of the loop.
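
Because `getInput(n)` calls `input()`, testing it means typing at the keyboard. One option, sketched here with a made-up helper named `isValid`, is to put the check in its own function so that it can be tried on many strings at once. This is a variation for illustration, not the version shown above.

```python
def isValid(guess):
    """return True if guess is a 9-letter word, quit, or empty string"""
    if guess == "" or guess == "quit":
        return True
    return len(guess) == 9 and guess.isalpha()

# try the same inputs as in the sample run above
for guess in ["hello", "123456789", "wwww eeee", "abcdefghi", "quit", ""]:
    print("%-12s %s" % (repr(guess), isValid(guess)))
```
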

And here’s how to read in the data from the word file (one word per line), and only select the 9-letter, lowercase, all-alphabetic words:

```
def read9(filename):
    """read all 9-letter words from file, shuffle the order, return list"""
    words = []
    inf = open(filename, "r")
    data = inf.readlines()
    inf.close()
    # get 9-letter words from data, add to words
    for word in data:
        word = word.strip()
        if len(word)==9 and word.islower() and word.isalpha():
            words.append(word)
    # shuffle, as promised in the function comment
    # (shuffle() comes from "from random import *" at the top of the file)
    shuffle(words)
    return words
```

Note in this one how I use `word.islower()` and not `word.islower()==True`. Either would work, but `word.islower()` is already a boolean (True or False), so I don’t need to compare it. I can just use it as is: if the word is 9 characters AND they are all lowercase AND they are all alphabetic characters (abcdefg…), add the word to the list of words.

### top-down-design on flashcards

Here’s an example of the program I want: read in some flashcards from a file, ask the user each card, keep track of how many they get correct, print an appropriate message, ask if they want to go again.

```
$ python3 flashcards.py
Flashcards file? german.txt
==============================
essen: to eat
Correct!
- -- -- -- -- -- -- -- -- -- -
kaufen: to buy
Correct!
- -- -- -- -- -- -- -- -- -- -
besuchen: to visit
Correct!
- -- -- -- -- -- -- -- -- -- -
fahren: to travel
Correct!
- -- -- -- -- -- -- -- -- -- -
lieben: to jump
Nope....lieben = to love
- -- -- -- -- -- -- -- -- -- -
schlafen: to sleep
Correct!
- -- -- -- -- -- -- -- -- -- -
spielen: to run
Nope....spielen = to play
- -- -- -- -- -- -- -- -- -- -
trinken: to drink
Correct!
- -- -- -- -- -- -- -- -- -- -
verstehen: to keep
Nope....verstehen = to understand
- -- -- -- -- -- -- -- -- -- -
OK...not terrible.
Go again? (y/n) n
Bye!
```

And here’s a sample flashcards data file:

```
$ cat german.txt
essen:to eat
kaufen:to buy
besuchen:to visit
fahren:to travel
lieben:to love
schlafen:to sleep
spielen:to play
trinken:to drink
verstehen:to understand
```

So again, the goal of the top-down design process is:

• `main()` all written out

• function stubs written (`def` with params, function comment, dummy `return` value)

• data structures clearly defined (store data in a list? an object? a list of objects? something else?)

• the design should run without any syntax errors

• running the design should show how the program will eventually work

In class we did this part together. Here’s the design after we typed it all in:

```
"""
flashcards program

J. Knerr
Fall 2019
"""

from random import *

def main():
    filename = input("Flashcards file? ")
    cards = readFile(filename)
    done = False
    while not done:
        shuffle(cards)
        ncorrect = flash(cards)
        # print message
        message(ncorrect, len(cards))
        # ask if they want to go again
        done = quit()

def quit():
    """return True if they want to quit"""
    result = input("Go again? ")
    if result != "y":
        return True
    else:
        return False

def message(ncorrect, nprobs):
    """print message to user based on percent correct"""
    print("Good work!")

def flash(cards):
    """given the cards, ask questions, return how many correct"""
    return 3

def readFile(filename):
    """open file, read in all data, return list-of-lists"""
    cards = [["q1","a1"], ["q2","a2"], ["q3","a3"]]
    return cards

main()
```

#### list-of-lists

Note the use of a "list of lists" in the `readFile()` function. I want to read in each card and make a list, like this: `["essen", "to eat"]`. Then I want to store all cards (which are lists) in a list. So the final data structure will look like this:

``````cards = [['essen', 'to eat'], ['kaufen', 'to buy'], ['besuchen', 'to visit'],
['fahren', 'to travel'], ['lieben', 'to love'], ['schlafen', 'to sleep'],
['spielen', 'to play'], ['trinken', 'to drink'], ['verstehen', 'to understand']]``````

For the above, what is `cards[1]`? And what is `cards[1][0]`?

```
>>> cards = [['essen', 'to eat'], ['kaufen', 'to buy'], ['besuchen', 'to visit'], ...]
>>> cards[1]
['kaufen', 'to buy']
>>> cards[1][0]
'kaufen'
>>> cards[1][1]
'to buy'
```
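
The same indexing works inside a loop. As a small sketch (using only the first three cards), this is one way a function like `flash()` might walk through the cards; the variable names `question` and `answer` are just illustrative.

```python
cards = [["essen", "to eat"], ["kaufen", "to buy"], ["besuchen", "to visit"]]

# each item in cards is itself a two-item list: [question, answer]
for card in cards:
    question = card[0]
    answer = card[1]
    print("%s = %s" % (question, answer))

# double indexing reaches a single string inside one card
print(cards[2][1])
```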

#### implement the `readFile()` function

Here’s one way to write the `readFile()` function to read in the data and return it as a list of lists.

``````def readFile(filename):
    """open file, read in all data, return list-of-lists"""
    inf = open(filename,"r")