file IO

motivation

Motivations and possible uses:

Files can be plain text files, like the ones you edit with vim.

syntax

The basic syntax for opening a file is:

myfile = open(filename, mode)

where filename is the name of the file and mode is the mode used for opening it: usually read ("r") or write ("w"). Both arguments are strings, and myfile is just the variable name I picked to store the file object returned by the open() function.

Here is an example of opening a file called poem.txt for reading, and storing the file object in a variable called infile:

infile = open("poem.txt", "r")
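
To make the two modes concrete, here is a minimal sketch (not part of the original notes). It assumes a scratch file in a temporary directory, and the name modes_demo.txt is made up for illustration. It also uses the read() method, which returns the whole file as one string. The print(...) call uses parentheses so the sketch should run under either Python 2 or Python 3.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory,
# so nothing real gets overwritten.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# "w" mode creates the file if it does not exist yet.
outfile = open(path, "w")
outfile.write("first version\n")
outfile.close()

# Opening the same file in "w" mode again starts over with an empty file.
outfile = open(path, "w")
outfile.write("second version\n")
outfile.close()

# "r" mode reads whatever is in the file now.
infile = open(path, "r")
contents = infile.read()
infile.close()

print(contents)
```

The thing to notice is that "w" mode throws away whatever the file held before, so be careful which filename you open for writing.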

examples

Once you have a file object, you can use the input and output methods on the object.

OUTPUT

Here's how to open a file for writing (note: myfile is a variable name that I chose, and "newfile" is the name of the file to write to):

$ python
>>> myfile = open("newfile", 'w')
>>> type(myfile)
<type 'file'>
>>> myfile.write("write this to the file \n")
>>> myfile.write("and this.... \n")
>>> myfile.close()

and here are the results of this:

$ ls
newfile
$ cat newfile 
write this to the file 
and this....

What happens if we leave out the \n on each line?
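
One way to find out is to try it. The sketch below is my own illustration, not from the original session. It writes the same two strings without the \n, reads the file back with read(), and prints the repr() of the result so any newline characters would be visible. The file goes in a temporary directory (an assumption, to avoid touching your real newfile).

```python
import os
import tempfile

# Scratch copy of "newfile" in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "newfile")

myfile = open(path, "w")
myfile.write("write this to the file ")   # no \n here
myfile.write("and this.... ")             # and none here
myfile.close()

# Read everything back as a single string.
myfile = open(path, "r")
text = myfile.read()
myfile.close()

# repr() shows the string with any special characters spelled out.
print(repr(text))
```

The write() method writes exactly the string it is given and nothing more, so each call picks up where the previous one stopped.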

INPUT

I have a file called words.txt with a few words in it:

$ cat words.txt 
happy
computer
lemon
zebra

To open a file for reading, use 'r' mode:

>>> infile = open("words.txt", 'r')

The file words.txt must exist; otherwise open() raises an error (an IOError in the Python 2 used here). Also note that infile is a variable holding a file object:

>>> type(infile)
<type 'file'>
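
The first point, that the file must already exist, can be sketched as follows. This is a hedged illustration rather than part of the original notes: the filename no_such_file.txt is hypothetical, and it lives in a freshly created temporary directory so it is guaranteed to be missing. Catching IOError should work in both Python 2 and Python 3.

```python
import os
import tempfile

# Hypothetical filename inside a fresh, empty temporary directory,
# so the file is guaranteed not to exist.
missing = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")

try:
    infile = open(missing, "r")
except IOError:
    # open() could not find the file to read.
    opened = False
    print("could not open file for reading")
else:
    # Only reached when open() succeeded.
    opened = True
    infile.close()
```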

The file object can also be used as a sequence (in a for loop!):

>>> for line in infile:
...    print line 
... 
happy

computer

lemon

zebra

We can use the for loop as above, or we can use the file methods: readline() reads one line, and readlines() reads them all at once.

>>> # need to close and reopen to get back to start of file
>>> infile.close()
>>> 
>>> infile = open("words.txt", "r")
>>> word = infile.readline()
>>> print word
happy

>>> word = infile.readline()
>>> print word
computer

>>> infile.close()
>>> infile = open("words.txt", "r")
>>> words = infile.readlines()
>>> print words
['happy\n', 'computer\n', 'lemon\n', 'zebra\n']

So readlines() reads in EVERYTHING and puts each line into a Python list. NOTE: the newline characters are still part of each line! That is also why the for loop above printed a blank line after each word: the line already ends in a newline, and print adds another one.

Sometimes you want to read in EVERYTHING, all at once. Sometimes it's better to read the data in line by line and process each line as you go (use the for loop: for line in infile).
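
Here is a small sketch comparing the two approaches and showing one common way to get rid of the newlines. It is my own addition: it recreates the words.txt data in a temporary directory (an assumption, so it runs anywhere) and uses the string method strip(), which removes leading and trailing whitespace, including the newline.

```python
import os
import tempfile

# Recreate the words.txt example in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "words.txt")
outfile = open(path, "w")
outfile.write("happy\ncomputer\nlemon\nzebra\n")
outfile.close()

# All at once: readlines() keeps the trailing newline on every line.
infile = open(path, "r")
raw = infile.readlines()
infile.close()

# Line by line: strip() removes the newline (and any other leading or
# trailing whitespace) before the word is stored.
words = []
infile = open(path, "r")
for line in infile:
    words.append(line.strip())
infile.close()

print(raw)
print(words)
```

With the newlines stripped, printing each word in a loop no longer leaves a blank line between them.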

File I/O Notes:

Another Example:

Suppose we have a file of usernames and grades, like this:

$ cat grades.txt 
lisa     :95
jeff     :35
jason    :88
adam     :97
frances  :96
rich     :77

Here's a program to figure out the average of those grades:

# store grades in python list, in case we need them later
grades = []

# read all grades into a list
gfile = open("grades.txt", "r")
for line in gfile:
  name, grade = line.split(":")   # split in two, based on colon
  grades.append(float(grade))
gfile.close()

# find ave of grades in list
total = 0.0
for g in grades:
  total += g
ave = total/len(grades)
print "ave grade = ", ave
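
As a variation on the program above, here is a sketch that wraps the same idea in a function. Everything new here is my own assumption rather than part of the original program: the function name average_grade is made up, the with statement is used so the file gets closed automatically, lines without a colon are skipped, and the built-in sum() replaces the totaling loop. The data is the same grades.txt content, written to a temporary directory so the sketch is self-contained.

```python
import os
import tempfile

def average_grade(filename):
    """Return the average of the grades in a name:grade file."""
    grades = []
    # The with statement closes the file automatically when the block ends.
    with open(filename, "r") as gfile:
        for line in gfile:
            if ":" not in line:      # skip blank or malformed lines
                continue
            name, grade = line.split(":")
            grades.append(float(grade))
    return sum(grades) / len(grades)

# Recreate the grades.txt data in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "grades.txt")
with open(path, "w") as outfile:
    outfile.write("lisa     :95\njeff     :35\njason    :88\n")
    outfile.write("adam     :97\nfrances  :96\nrich     :77\n")

ave = average_grade(path)
print("ave grade = %.2f" % ave)
```

Note that this version would still fail on a file with no grades at all, since it would divide by zero.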

challenge

How would you read in multiple quiz grades for each student, finding the average quiz grade for each student?

$ cat quizgrades.txt 
lisa    :98,100,95,93,99
jeff    :58,50,55,53,59
rich    :88,78,89,75,79
frances :67,78,89,90,99

$ python quizgrades.py 

      lisa:  97.0
      jeff:  55.0
      rich:  81.8
   frances:  84.6
