Project 4: Classification
Due Tuesday, Dec. 3

On the left, an image of a person wearing sunglasses. On the right, an image representing the weights of a hidden unit in a neural network trained to classify such images.

Introduction

This project gives you the chance to formulate a machine learning problem of your choice and use a neural network to solve it. You will focus on a data set containing images of people's faces.

First you will use neural networks to solve some of the simpler problems we discussed in class, such as the logic problems AND, OR, and XOR. Next you will look at the auto-encoder problem, where the network develops a binary-like representation in its hidden layer. Then you will experiment with the handwritten digit recognizer. Finally, you will focus on the image data.

After a thorough examination of neural networks, you can then choose to explore a completely different machine learning technique and apply it to a problem of your choice.

Run update63 to get a copy of this week's files.

Files and directories that will be graded:
  xor.txt              Your explanation for how a three-layer network solves XOR
  encoder.txt          Your description of the hidden layer representation created by the auto-encoder network
  digits.txt           Your description of the hidden layer features created by the digit classifying network
  images.txt           Your description of the image classification task you chose
  myML                 A directory containing all of the relevant files for Question 5

Files you'll use:
  and-net.py           Creates a two-layer network to solve AND
  or-net.py            Creates a two-layer network to solve OR
  xor-net.py           Creates a two-layer network that is unable to solve XOR
  xor-3layer.py        Creates a three-layer network to solve XOR
  inputs.dat           Contains the four input patterns for all of the logic problems
  and-targets.dat      Contains the four target patterns for AND
  or-targets.dat       Contains the four target patterns for OR
  xor-targets.dat      Contains the four target patterns for XOR
  8bit-net.py          Creates a three-layer auto-encoder network
  8bit.dat             Contains the eight input and target patterns for the auto-encoder problem
  digit-net.py         Creates a three-layer network to classify handwritten digits
  digit-inputs.dat     Contains the 100 input patterns for the digits classifier
  digit-targets.dat    Contains the 100 target patterns for the digits classifier
  glasses-net.py       Creates a three-layer network to classify images of people
  sunglassesfiles      File names of the images to be used to create the glasses training set
  glasses-inputs.dat   Contains the 40 input patterns for the glasses classifier
  glasses-targets.dat  Contains the 40 target patterns for the glasses classifier
  processingFaces.py   A collection of functions to convert the raw image data into a form usable by a neural network

Supporting files you can read, but should not modify:
  conx.py              Implements neural networks
  newConx.py           Enhances the original conx code by providing visualization

Evaluation

There is no autograder for this project.

In this project there are five required questions worth a possible 20 points.

This project is due by 11:59pm Tuesday, Dec. 3. Run handin63 to turn in your solution.

Using the neural network library

In many of the examples below the network will be called n. In the Python interpreter, you can use the following methods:

Solving logic problems

See how neural networks perform on the logic problems AND, OR, and XOR.

Create a simple network to solve AND:

python -i and-net.py
It has just two inputs and one output, along with a bias. Before training, test the AND network's performance and look at its weights. Then train the network and re-test its performance and check out how the weights have changed. Do the weights make sense to you?

Create a simple network to solve OR:

python -i or-net.py
It has the same structure as the previous network. Try all of the same commands as before. Convince yourself that the weights make sense.
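
One way to judge whether learned weights "make sense" is to compare them with a unit built by hand. The sketch below is plain Python 3, not conx code. It assumes a simple step-threshold unit, and the weights and biases are hand-picked for illustration rather than learned, so the numbers your trained networks report will differ in value even if they follow the same pattern.

```python
def threshold_unit(x1, x2, w1, w2, bias):
    """Return 1 if the weighted sum plus the bias is positive, else 0."""
    return 1 if w1 * x1 + w2 * x2 + bias > 0 else 0


# The four input patterns shared by the logic problems.
INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Hand-picked (not learned) parameters, assumed for illustration.
# AND: both inputs must be on to overcome the bias.
# OR: a single active input is enough.
AND_PARAMS = {"w1": 1.0, "w2": 1.0, "bias": -1.5}
OR_PARAMS = {"w1": 1.0, "w2": 1.0, "bias": -0.5}

and_outputs = [threshold_unit(a, b, **AND_PARAMS) for a, b in INPUTS]
or_outputs = [threshold_unit(a, b, **OR_PARAMS) for a, b in INPUTS]

print("AND:", and_outputs)
print("OR: ", or_outputs)
```

The only difference between the two units is the bias, which sets how much total input is needed before the unit turns on.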

Create a simple network to try to solve XOR:

python -i xor-net.py
Again it has the same structure as the previous two examples. When you train this network it will be unable to learn. You can do a CTRL-C to interrupt the training.
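
The failure to learn XOR is a property of the two-layer architecture, not of the training run. The sketch below is plain Python 3, independent of conx, and assumes a step-threshold unit. It tries every combination of two weights and a bias on a coarse grid and counts how many reproduce each target pattern. A grid search is only a demonstration, not a proof; the underlying reason is that no single line separates the XOR inputs that should give 1 from those that should give 0.

```python
import itertools

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND_TARGETS = [0, 0, 0, 1]
XOR_TARGETS = [0, 1, 1, 0]


def unit_outputs(w1, w2, bias):
    """Outputs of one step-threshold unit on the four input patterns."""
    return [1 if w1 * a + w2 * b + bias > 0 else 0 for a, b in INPUTS]


def count_solutions(targets, grid):
    """Count (w1, w2, bias) settings on the grid that reproduce the targets."""
    return sum(
        1
        for w1, w2, bias in itertools.product(grid, repeat=3)
        if unit_outputs(w1, w2, bias) == targets
    )


# Candidate values from -5.0 to 5.0 in steps of 0.5 (an arbitrary choice).
GRID = [i * 0.5 for i in range(-10, 11)]

and_solutions = count_solutions(AND_TARGETS, GRID)
xor_solutions = count_solutions(XOR_TARGETS, GRID)

print("settings that solve AND:", and_solutions)
print("settings that solve XOR:", xor_solutions)
```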

Create a more complex network that can solve XOR:

python -i xor-3layer.py
In this case the network has three layers (input, hidden, and output) instead of just two (input, output). To see all of the weights for this network requires two commands:

Question 1 (2 points) After training this three-layer XOR network, use the above commands to inspect the final learned parameters. Edit the diagram of the network in the file xor.txt to show all of the trained weights and biases. Then explain how the network has solved the problem based on them.

Analyzing hidden layer representations

Re-read section 4.6.4 starting on page 26 in Tom Mitchell's Chapter 4 about hidden layer representations.

Create a network that learns to take eight-bit patterns and reproduce them on the output layer after re-coding them in a three-unit hidden layer:

python -i 8bit-net.py
After training, use the showPerformance() method to inspect the hidden layer representations.

Question 2 (2 points) In the file encoder.txt write down each of the hidden layer representations created by the network. Then convert them into bit patterns by rounding each hidden unit activation to either 0 or 1. Has the network re-coded them using a binary-like representation?
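
The rounding step can be sketched in a few lines of plain Python 3. The activation values below are made up purely for illustration (and only three of the eight patterns are shown); substitute the values you read from your own trained network. The 0.5 cutoff is an assumption about what "rounding" means here.

```python
def to_bits(activations, threshold=0.5):
    """Round each hidden activation to 0 or 1 using the given cutoff."""
    return tuple(1 if a >= threshold else 0 for a in activations)


# Made-up hidden activations for three input patterns (illustration only).
hidden = {
    "10000000": [0.02, 0.97, 0.91],
    "01000000": [0.95, 0.03, 0.12],
    "00100000": [0.08, 0.04, 0.99],
}

codes = {pattern: to_bits(acts) for pattern, acts in hidden.items()}

# A binary-like re-coding gives every input pattern its own bit code.
all_distinct = len(set(codes.values())) == len(codes)

for pattern in sorted(codes):
    print(pattern, "->", codes[pattern])
print("all codes distinct:", all_distinct)
```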

Classifying handwritten digits

Create a network that learns to classify handwritten digits:

python -i digit-net.py
This network takes 8x8 images as input, passes them through a 5-unit hidden layer, and classifies them in a 10-unit output layer. Each unit in the output layer is associated with one digit (0-9). For example, if the last unit is highly active, this means that the network is classifying the current image as the digit 9.
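
Reading a classification off the output layer can be sketched as picking the most active unit. This is plain Python 3 with made-up activation values; the provided digit-net.py may use a different criterion (for example, a tolerance around the target values), so treat the "most active unit wins" rule as an assumption.

```python
def classify_digit(output_activations):
    """Return the index (0-9) of the most active output unit."""
    if len(output_activations) != 10:
        raise ValueError("expected 10 output activations, one per digit")
    return max(range(10), key=lambda i: output_activations[i])


# Made-up activations in which the last unit is highly active.
example = [0.01, 0.03, 0.02, 0.05, 0.01, 0.02, 0.04, 0.10, 0.07, 0.93]
predicted = classify_digit(example)
print("predicted digit:", predicted)
```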

When you run this file it will open up additional windows so that you can visualize the input, hidden, and output activations as well as the hidden weights. Move the windows around so that you can see all of them. Train the network to classify all of the patterns.

Question 3 (3 points) Test all of the patterns and record your observations in the table provided in the file digits.txt. Look at the hidden layer weights in the displays. Describe how the network learned to recognize each type of digit based on the hidden layer features it has discovered.

Classifying images

We will be using the same types of images that were discussed in this week's reading. Re-read section 4.7 from Tom Mitchell's Chapter 4 (pages 32-36).

The directory /home/meeden/public/cs63/faces_4/ contains 624 images stored in PGM format. You can view one of these images using the xv command. Each file is named according to the following convention:

userid_pose_expression_eyes_scale.pgm
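
Because every attribute is encoded in the file name, the names can be split into fields and filtered in Python. The sketch below is plain Python 3. The file names and the helper parse_face_filename are hypothetical, invented for illustration, and the code assumes that no field (including the userid) contains an underscore.

```python
import os
from collections import namedtuple

FaceInfo = namedtuple("FaceInfo", ["userid", "pose", "expression", "eyes", "scale"])


def parse_face_filename(path):
    """Split userid_pose_expression_eyes_scale.pgm into its five fields."""
    stem, extension = os.path.splitext(os.path.basename(path))
    if extension != ".pgm":
        raise ValueError("expected a .pgm file: %r" % path)
    fields = stem.split("_")
    if len(fields) != 5:
        raise ValueError("expected 5 underscore-separated fields: %r" % path)
    return FaceInfo(*fields)


# Hypothetical file names that follow the naming convention.
names = [
    "userA_straight_neutral_sunglasses_4.pgm",
    "userA_left_happy_open_4.pgm",
    "userB_straight_neutral_open_4.pgm",
]

parsed = [parse_face_filename(name) for name in names]
straight_neutral = [
    p for p in parsed if p.pose == "straight" and p.expression == "neutral"
]
print(len(straight_neutral), "of", len(parsed), "images are straight and neutral")
```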

Using a neural network to learn a classification task involves the following steps: determining a task, gathering appropriate data, creating the training set of inputs and targets, creating a network with the appropriate topology and parameter settings, training the network, and finally analyzing the results. Each of these steps is explained in more detail below. After you have tried these steps on the sunglasses example, you will repeat this process on a task of your choosing.

  1. Choose a task
    The first step is deciding what classification task you would like to learn. There are a number of possibilities such as presence of sunglasses, head position, emotion, or identifying particular individuals. Some of these classification tasks will be easier to learn than others. As an example, let's focus on the relatively easy task of determining whether or not a person is wearing sunglasses.
  2. Gather data
    Next we need to select which of the images to use in our training set. As a starting point, let's focus on images where the person is looking straight ahead and has a neutral expression. We need to gather together all of the filenames that meet these criteria. We can use ls with the -C1 flag to ensure that the output from the listing is in one column. We can choose the appropriate files with the wildcard *, and then use the greater-than sign (>) to save those image file names into a file in our own directory:
    cd /home/meeden/public/cs63/faces_4
    ls -C1 *straight*neutral* > ~/cs63/projects/4/sunglassesfiles
    
    Using the Unix command wc ~/cs63/projects/4/sunglassesfiles we can see that this file has 40 lines in it. We may want a slightly larger training set, so let's add in all the images where a person is looking straight ahead and has a happy expression. We can do this by again using ls to select these images and using two greater-than signs (>>) to append these additional names to the end of the same file:
    ls -C1 *straight*happy* >> ~/cs63/projects/4/sunglassesfiles
    
    Using the wc command again we see that we now have 79 images.

  3. Create training set
    Once we have selected a good set of images, we need to convert them into a format that is appropriate for the neural network. The PGM files have pixel values between 0 and 255; we need to normalize these values between 0 and 1. In addition, we need to create input files where each normalized image is written one per line. I have written some python functions in the file processingFaces.py to help you prepare these image files:
    Execute this file by doing:

    python processingFaces.py
    It will create two files called glasses-inputs.dat and glasses-targets.dat.

  5. Set up network
    Once the training data has been prepared we can create the neural network. I have provided you with example code in the file glasses-net.py to do this. This creates a new class called SunglassesRecognizer that inherits from the class BackpropNetwork. It adds two additional methods, classify and evaluate. At the bottom of the file, it sets up the network:

    Execute this file by doing:

    python -i glasses-net.py
    This will produce a number of windows. Move them around so that you can see them all.

  6. Train network
    One of the advantages of using neural networks is their generalization ability. We want to train the network using the training set, but we don't want the network to memorize the training set. We want the trained network to be able to respond appropriately to novel input. We need to be careful not to overtrain the network. Thus the goal is not to achieve 100 percent correctness. We will train the network for a number of epochs and then test the network on novel data to monitor how well it is generalizing. We will repeat this process until the network is no longer improving its performance on the novel data. For example, try the following commands in the python interpreter after you have set up the network:
    >>> n.train(5)
    Epoch #     1 | TSS Error: 15.3588 | Correct: 0.0000
    Epoch #     2 | TSS Error: 15.2270 | Correct: 0.0000
    Epoch #     3 | TSS Error: 15.2524 | Correct: 0.0000
    Epoch #     4 | TSS Error: 15.1156 | Correct: 0.0000
    Epoch #     5 | TSS Error: 14.8360 | Correct: 0.0000
    Reset limit reached; ending without reaching goal
    ----------------------------------------------------
    Final #     5 | TSS Error: 14.8360 | Correct: 0.0000
    ----------------------------------------------------
    
    You can see that the error is generally dropping (despite a small rise at epoch 3), but so far the network has not learned to respond correctly to any of the training patterns. NOTE: because each network is initialized with different random weights, your training results will vary. Let's check on how well the network is doing on the test patterns.
    >>> n.swapData()
    Swapping training and testing sets...
    19 training patterns, 60 test patterns
    >>> n.evaluate()
    network classified image #0 (sunglasses) as ???
    network classified image #1 (sunglasses) as ???
    network classified image #2 (eyes) as ???
    network classified image #3 (sunglasses) as ???
    network classified image #4 (eyes) as ???
    network classified image #5 (eyes) as ???
    network classified image #6 (eyes) as ???
    network classified image #7 (eyes) as ???
    network classified image #8 (eyes) as ???
    network classified image #9 (eyes) as ???
    network classified image #10 (sunglasses) as ???
    network classified image #11 (sunglasses) as ???
    network classified image #12 (eyes) as ???
    network classified image #13 (eyes) as ???
    network classified image #14 (eyes) as ???
    network classified image #15 (eyes) as ???
    network classified image #16 (eyes) as ???
    network classified image #17 (sunglasses) as ???
    network classified image #18 (eyes) as ???
    19 patterns: 0 correct (0.0%), 19 wrong (100.0%)
    
    As expected, it is not responding correctly to these either. Let's reset the data, continue training, and then re-test.
    >>> n.swapData()
    Swapping training and testing sets...
    60 training patterns, 19 test patterns
    >>> n.train(5)
    Epoch #     1 | TSS Error: 14.2245 | Correct: 0.0000
    Epoch #     2 | TSS Error: 12.9923 | Correct: 0.0000
    Epoch #     3 | TSS Error: 10.3263 | Correct: 0.0000
    Epoch #     4 | TSS Error: 7.8044 | Correct: 0.0833
    Epoch #     5 | TSS Error: 5.8430 | Correct: 0.3500
    Reset limit reached; ending without reaching goal
    ----------------------------------------------------
    Final #     5 | TSS Error: 5.8430 | Correct: 0.3500
    ----------------------------------------------------
    >>> n.swapData()
    Swapping training and testing sets...
    19 training patterns, 60 test patterns
    >>> n.evaluate()
    network classified image #2 (eyes) as ???
    network classified image #4 (eyes) as ???
    network classified image #5 (eyes) as ???
    network classified image #6 (eyes) as ???
    network classified image #7 (eyes) as ???
    network classified image #8 (eyes) as ???
    network classified image #9 (eyes) as ???
    network classified image #10 (sunglasses) as ???
    network classified image #12 (eyes) as ???
    network classified image #13 (eyes) as ???
    network classified image #14 (eyes) as ???
    network classified image #15 (eyes) as ???
    network classified image #16 (eyes) as ???
    network classified image #18 (eyes) as ???
    19 patterns: 5 correct (26.3%), 14 wrong (73.7%)
    
    Clearly the performance of the network is improving. After several more iterations of this process of resetting the data and additional training, the network will start performing well on both the training set and the testing set and learning can be stopped. Remember the goal is to achieve good performance on the training set while still being able to respond appropriately to the novel data in the testing set.
  7. Analyze results
    Once the training process is complete, we can look more closely at the activations and weights to try to understand how the network has solved the task. In the python interpreter do n.showPerformance() and observe for which images each unit is active. How are the hidden units coding for the different categories of images? For the sunglasses task there is only one hidden unit, so this is pretty straightforward. Next look at the hidden weights. What parts of the image are the hidden units focusing on? Why are these locations of the image important for the given task?
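
The train-then-test cycle described under "Train network" can be written as a generic loop that stops once performance on the novel data stops improving. The sketch below is plain Python 3 and does not call conx. The function name, the patience parameter, and the scores are all assumptions made for illustration. With the project's network, train_chunk might wrap n.train(5), and evaluate would need to return the fraction of test patterns classified correctly, which you would have to adapt yourself.

```python
def train_with_early_stopping(train_chunk, evaluate, patience=2, max_rounds=50):
    """Alternate training and testing; stop when the test score stalls.

    train_chunk: callable that trains for a few epochs.
    evaluate:    callable that returns a score on the novel (test) data.
    patience:    rounds without improvement to tolerate before stopping.
    """
    best_score = float("-inf")
    best_round = 0
    history = []
    for round_number in range(1, max_rounds + 1):
        train_chunk()
        score = evaluate()
        history.append(score)
        if score > best_score:
            best_score, best_round = score, round_number
        elif round_number - best_round >= patience:
            break  # no improvement for `patience` rounds in a row
    return best_round, best_score, history


# Made-up test-set scores standing in for a real network (illustration only).
fake_scores = iter([0.0, 0.26, 0.58, 0.84, 0.79, 0.84, 0.74])

best_round, best_score, history = train_with_early_stopping(
    train_chunk=lambda: None,            # a real run would train here
    evaluate=lambda: next(fake_scores),  # a real run would test here
    patience=2,
)
print("best round:", best_round, "best score:", best_score)
print("rounds run:", len(history))
```

Stopping on the test score rather than the training score is what guards against overtraining: the training error can keep falling after the network has stopped generalizing.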

Question 4 (5 points) After you have successfully tried the sunglasses example, repeat this process (all of the steps above) on a task of your choice using some subset of the faces image data. In the file images.txt explain in detail what you did in each step. Also, take a screenshot of your resulting hidden weights and save it in a file called hiddens.png so that you can refer to it in your description. Creating a successful classifier often requires an iterative process. You may find that you need more hidden units. You may find that you need more training examples. Some problems you'd like to try (such as recognizing the emotions in the images) may be quite difficult to solve.
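
When you build a training set for your own task, the normalization described under "Create training set" amounts to dividing each pixel by 255 and writing one image per line. The sketch below is plain Python 3 and is not taken from processingFaces.py. The helper names, the space-separated four-decimal line format, the 1/0 target encoding, and the file names are all assumptions for illustration; check the provided .dat files for the format the network actually expects.

```python
def normalize_pixels(pixels, max_value=255):
    """Scale raw pixel values from 0..max_value into the range 0..1."""
    return [p / float(max_value) for p in pixels]


def pattern_line(values):
    """Format one pattern as a single space-separated line (assumed format)."""
    return " ".join("%.4f" % v for v in values)


def sunglasses_target(filename):
    """Assumed encoding: 1.0 if the name says sunglasses, otherwise 0.0."""
    return 1.0 if "sunglasses" in filename else 0.0


# A tiny made-up "image" of four pixels stands in for a real PGM file.
raw = [0, 51, 255, 128]
normalized = normalize_pixels(raw)
line = pattern_line(normalized)
print(line)

# Hypothetical file names following the naming convention.
targets = [
    sunglasses_target("userA_straight_neutral_sunglasses_4.pgm"),
    sunglasses_target("userB_straight_neutral_open_4.pgm"),
]
print(targets)
```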

Machine Learning Application

Question 5 (8 points) For this question you can explore something of particular interest to you related to machine learning. Be sure to put all of the relevant files in the directory myML. Explain how to run your code in the README file. Here are some suggestions, but I am open to other options; just OK them with me first:

Remember to run handin63 by 11:59pm on Tuesday, Dec. 3. Also be sure to select your partner.