Project 4: Classification

Due Tuesday, Dec. 3

On the left, an image of a person wearing sunglasses. On the right, an image representing the weights of a hidden unit in a neural network trained to classify such images.

Introduction

This project gives you the chance to formulate a machine learning problem of your choice and use a neural network to solve it. You will focus on a data set containing images of people's faces.

First you will explore using neural networks to solve some of the simpler problems we discussed in class, such as the logic problems AND, OR, and XOR. Next you will look at the auto-encoder problem, where the network develops a binary-like representation in its hidden layer. Then you will experiment with the handwritten digit recognizer. Finally you will focus on the image data.

After a thorough examination of neural networks, you can then choose to explore a completely different machine learning technique and apply it to a problem of your choice.

Run update63 to get a copy of this week's files.

Files and directories that will be graded:
`xor.txt`	Your explanation for how a three-layer network solves XOR
`encoder.txt`	Your description of the hidden layer representation created by the auto-encoder network
`digits.txt`	Your description of the hidden layer features created by the digit classifying network
`images.txt`	Your description of the image classification task you chose
`myML`	A directory containing all of the relevant files for Question 5
Files you'll use:
`and-net.py`	Creates a two-layer network to solve AND
`or-net.py`	Creates a two-layer network to solve OR
`xor-net.py`	Creates a two-layer network that is unable to solve XOR
`xor-3layer.py`	Creates a three-layer network to solve XOR
`inputs.dat`	Contains the four input patterns for all of the logic problems
`and-targets.dat`	Contains the four target patterns for AND
`or-targets.dat`	Contains the four target patterns for OR
`xor-targets.dat`	Contains the four target patterns for XOR
`8bit-net.py`	Creates a three-layer auto-encoder network
`8bit.dat`	Contains the eight input and target patterns for the auto-encoder problem
`digit-net.py`	Creates a three-layer network to classify handwritten digits
`digit-inputs.dat`	Contains the 100 input patterns for the digits classifier
`digit-targets.dat`	Contains the 100 target patterns for the digits classifier
`glasses-net.py`	Creates a three-layer network to classify images of people
`sunglassesfiles`	File names of the images to be used to create the glasses training set
`glasses-inputs.dat`	Contains the 40 input patterns for the glasses classifier
`glasses-targets.dat`	Contains the 40 target patterns for the glasses classifier
`processingFaces.py`	A collection of functions to convert the raw image data into a form usable by a neural network
Supporting files you can read, but should not modify:
`conx.py`	Implements neural networks
`newConx.py`	Enhances the original conx code by providing visualization

Evaluation

There is no autograder for this project.

In this project there are five required questions worth a possible 20 points.

This project is due by 11:59pm Tuesday, Dec. 3. Run handin63 to turn in your solution.

Using the neural network library

In many of the examples below the network will be called n. In the Python interpreter, you can use the following methods:

n.showPerformance() will display how the network responds to each of the training patterns. Initially it will get every pattern wrong since the weights are randomly initialized and no learning has taken place.
n.printWeights(layer1, layer2) will display the network's current weights between layer1 and layer2. For simple two-layer networks, the layers are called 'input' and 'output'. For three-layer networks, the middle layer is called 'hidden'.
n.train() will repeatedly train the network on the set of training patterns. Each time through all of the patterns is called an epoch. When the network is successfully learning, the total amount of error per epoch should decrease over time.
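
The training transcripts later in this handout report a "TSS Error" and a "Correct" value for each epoch. The sketch below shows one plausible reading of those two numbers, assuming TSS means the total sum of squared errors and Correct means the fraction of patterns whose outputs all fall within the tolerance of their targets. It is not taken from conx.py; the function names and the example outputs are made up for illustration.

```python
def total_squared_error(outputs, targets):
    """Sum squared differences over every pattern and every output unit."""
    return sum(
        (t - o) ** 2
        for out_row, target_row in zip(outputs, targets)
        for o, t in zip(out_row, target_row)
    )

def fraction_correct(outputs, targets, tolerance):
    """Fraction of patterns whose outputs are all within tolerance of target."""
    hits = sum(
        1
        for out_row, target_row in zip(outputs, targets)
        if all(abs(t - o) <= tolerance for o, t in zip(out_row, target_row))
    )
    return hits / float(len(targets))

# AND targets with made-up network outputs before and after training.
targets = [[0], [0], [0], [1]]
before = [[0.5], [0.5], [0.5], [0.5]]
after = [[0.1], [0.2], [0.2], [0.7]]

for label, outputs in (("before", before), ("after", after)):
    print("%s: TSS %.2f, correct %.2f" % (
        label,
        total_squared_error(outputs, targets),
        fraction_correct(outputs, targets, tolerance=0.4),
    ))
```

Under these assumptions, an untrained network whose outputs hover near 0.5 misses every pattern, which fits the behavior described for showPerformance() above.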

Solving logic problems

See how neural networks perform on the logic problems AND, OR, and XOR.

Create a simple network to solve AND:

python -i and-net.py

It has just two inputs and one output, along with a bias. Before training, test the AND network's performance and look at its weights. Then train the network and re-test its performance and check out how the weights have changed. Do the weights make sense to you?

Create a simple network to solve OR:

python -i or-net.py

It has the same structure as the previous network. Try all of the same commands as before. Convince yourself that the weights make sense.
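
When judging whether learned weights make sense, it can help to compare them with a unit wired by hand. The sketch below is not conx code: it uses a hard threshold rather than a smooth activation function, and the weight and bias values are illustrative assumptions chosen by hand, not values your network will learn.

```python
INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def threshold_unit(x1, x2, w1, w2, bias):
    """Return 1 if the weighted sum plus bias is positive, else 0."""
    return 1 if w1 * x1 + w2 * x2 + bias > 0 else 0

# Hand-picked (not learned) weights: AND needs both inputs on to
# overcome the negative bias, while OR needs only one.
AND_WEIGHTS = (1.0, 1.0, -1.5)
OR_WEIGHTS = (1.0, 1.0, -0.5)

and_outputs = [threshold_unit(a, b, *AND_WEIGHTS) for a, b in INPUTS]
or_outputs = [threshold_unit(a, b, *OR_WEIGHTS) for a, b in INPUTS]
print(and_outputs)
print(or_outputs)
```

The only difference between the two units is the size of the negative bias, which is one pattern to look for when comparing the trained AND and OR networks.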

Create a simple network to try to solve XOR:

python -i xor-net.py

Again it has the same structure as the previous two examples. When you train this network it will be unable to learn. You can do a CTRL-C to interrupt the training.
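
One way to get a feel for why the two-layer network fails is to search for weights by brute force. The sketch below tries every combination of two weights and a bias on a coarse grid for a single hard-threshold unit. The grid range and step are arbitrary assumptions, and a search like this is only an illustration, not a proof; the underlying reason is that XOR is not linearly separable.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
XOR_TARGETS = [0, 1, 1, 0]
AND_TARGETS = [0, 0, 0, 1]

def unit_outputs(w1, w2, bias):
    """Outputs of a single hard-threshold unit on all four patterns."""
    return [1 if w1 * a + w2 * b + bias > 0 else 0 for a, b in INPUTS]

def count_solutions(targets, grid):
    """Count (w1, w2, bias) settings on the grid that reproduce targets."""
    return sum(
        1
        for w1, w2, bias in product(grid, repeat=3)
        if unit_outputs(w1, w2, bias) == targets
    )

# Candidate values from -2.0 to 2.0 in steps of 0.25 (an arbitrary choice).
GRID = [i / 4.0 for i in range(-8, 9)]

xor_solutions = count_solutions(XOR_TARGETS, GRID)
and_solutions = count_solutions(AND_TARGETS, GRID)
print("XOR solutions: %d, AND solutions: %d" % (xor_solutions, and_solutions))
```

The search finds many settings that compute AND and none that compute XOR.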

Create a more complex network that can solve XOR:

python -i xor-3layer.py

In this case the network has three layers (input, hidden, and output) instead of just two (input, output). To see all of the weights for this network requires two commands:

n.printWeights('input','hidden')
n.printWeights('hidden','output')

Question 1 (2 points) After training this three-layer XOR network, use the above commands to inspect the final learned parameters. Edit the diagram of the network in the file xor.txt to show all of the trained weights and biases. Then explain how the network has solved the problem based on them.

Analyzing hidden layer representations

Re-read section 4.6.4 starting on page 26 in Tom Mitchell's Chapter 4 about hidden layer representations.

Create a network that learns to take eight-bit patterns and reproduce them on the output layer after re-coding them in a three-unit hidden layer:

python -i 8bit-net.py

After training, use the showPerformance() method to inspect the hidden layer representations.

Question 2 (2 points) In the file encoder.txt write down each of the hidden layer representations created by the network. Then convert them into bit patterns by rounding each hidden unit activation to either 0 or 1. Has the network re-coded them using a binary-like representation?
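
The rounding step in Question 2 can be sketched as follows. The activation values here are hypothetical placeholders for only three of the eight patterns; substitute the values that showPerformance() reports for your own network. The helper name to_bits and the 0.5 threshold are assumptions made for this example.

```python
def to_bits(activations, threshold=0.5):
    """Round each hidden activation to 0 or 1."""
    return tuple(1 if a >= threshold else 0 for a in activations)

# Hypothetical hidden-layer activations (three units per pattern).
hidden = [
    (0.91, 0.08, 0.97),
    (0.04, 0.12, 0.88),
    (0.95, 0.93, 0.06),
]

codes = [to_bits(h) for h in hidden]
print(codes)

# A fully binary-like code gives every pattern a different bit string.
all_distinct = len(set(codes)) == len(codes)
print("all distinct: %s" % all_distinct)
```

With three hidden units there are exactly eight possible bit strings, so checking whether all eight of your patterns round to distinct codes is a quick test of how binary-like the representation is.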

Classifying handwritten digits

Create a network that learns to classify handwritten digits:

python -i digit-net.py

This network takes 8x8 images as input, passes them through a 5-unit hidden layer, and classifies them in a 10-unit output layer. Each unit in the output layer is associated with one digit (0-9). For example, if the last unit is highly active, this means that the network is classifying the current image as the digit 9.
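
Reading a digit off the 10-unit output layer can be sketched as picking the most active unit. This is one common interpretation and an assumption here; the library itself may instead compare each output against its target using a tolerance. The activation values are made up.

```python
def predicted_digit(output_activations):
    """Return the index (0-9) of the most active output unit."""
    if len(output_activations) != 10:
        raise ValueError("expected 10 output activations")
    return max(range(10), key=lambda i: output_activations[i])

# Hypothetical activations in which the last unit is the most active.
example = [0.02, 0.01, 0.05, 0.03, 0.10, 0.02, 0.01, 0.04, 0.20, 0.93]
digit = predicted_digit(example)
print(digit)
```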

When you run this file it will open up additional windows so that you can visualize the input, hidden, and output activations as well as the hidden weights. Move the windows around so that you can see all of them. Train the network to classify all of the patterns.

Question 3 (3 points) Test all of the patterns and record your observations in the table provided in the file digits.txt. Look at the hidden layer weights in the displays. Describe how the network learned to recognize each type of digit based on the hidden layer features it has discovered.

Classifying images

We will be using the same types of images that were discussed in this week's reading. Re-read section 4.7 from Tom Mitchell's Chapter 4 (pages 32-36).

The directory /home/meeden/public/cs63/faces_4/ contains 624 images stored in PGM format. You can view one of these images using the xv command. Each file is named according to the following convention:

userid_pose_expression_eyes_scale.pgm

userid is the user id of the person in the image. This field has 20 values: an2i, at33, boland, bpm, ch4f, cheyer, choon, danieln, glickman, karyadi, kawamura, kk49, megak, mitchell, night, phoebe, saavik, steffi, sz24, and tammo.
pose is the head position of the person, and this field has 4 values: straight, left, right, and up.
expression is the facial expression of the person, and this field has 4 values: neutral, happy, sad, and angry.
eyes is the eye state of the person, and this field has 2 values: open and sunglasses.
scale is the scale of the image. All of the images are of scale 4, which indicates a quarter-resolution image (32 x 30).
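
Because every field is separated by an underscore, a file name can be split back into its parts. The sketch below assumes that no field contains an underscore itself, which holds for the values listed above. The names FaceInfo and parse_face_filename are invented for this example and are not part of processingFaces.py.

```python
from collections import namedtuple

FaceInfo = namedtuple("FaceInfo", "userid pose expression eyes scale")

def parse_face_filename(filename):
    """Split 'userid_pose_expression_eyes_scale.pgm' into its fields."""
    stem, dot, extension = filename.rpartition(".")
    if dot != "." or extension != "pgm":
        raise ValueError("not a .pgm file name: %r" % filename)
    fields = stem.split("_")
    if len(fields) != 5:
        raise ValueError("expected 5 underscore-separated fields: %r" % filename)
    return FaceInfo(*fields)

# Illustrative name built from the field values listed above.
info = parse_face_filename("mitchell_straight_happy_sunglasses_4.pgm")
print("%s %s %s %s %s" % info)
```

Any of the returned fields could serve as the label for a classification task, for example info.pose for head position.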

Using a neural network to learn a classification task involves the following steps: determining a task, gathering appropriate data, creating the training set of inputs and targets, creating a network with the appropriate topology and parameter settings, training the network, and finally analyzing the results. Each of these steps is explained in more detail below. After you have tried these steps on the sunglasses example, you will repeat this process on a task of your choosing.

Choose a task
The first step is deciding what classification task you would like to learn. There are a number of possibilities such as presence of sunglasses, head position, emotion, or identifying particular individuals. Some of these classification tasks will be easier to learn than others. As an example, let's focus on the relatively easy task of determining whether or not a person is wearing sunglasses.

Gather data
Next we need to select which of the images to use in our training set. As a starting point, let's focus on images where the person is looking straight ahead and has a neutral expression. We need to gather together all of the filenames that meet these criteria. We can use ls with the -C1 flag to ensure that the output from the listing is in one column. We can choose appropriate files with the wildcard *, and then use the greater-than sign (>) to save those image file names into a file in our own directory:
```
cd /home/meeden/public/cs63/faces_4
ls -C1 *straight*neutral* > ~/cs63/projects/4/sunglassesfiles
```
Using the Unix command wc ~/cs63/projects/4/sunglassesfiles we can see that this file has 40 lines in it. We may want a slightly larger training set, so let's add in all the images where a person is looking straight ahead and has a happy expression. We can do this by again using ls to select these images and using two greater-than signs (>>) to append these additional names to the end of the same file:
```
ls -C1 *straight*happy* >> ~/cs63/projects/4/sunglassesfiles
```
Using the wc command again we see that we now have 79 images.
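
The same wildcard selection can be sketched in Python with the standard fnmatch module, which may be handy when the selection criteria for your own task get more involved. The directory listing below is a short hypothetical stand-in; in practice the names would come from os.listdir() on the faces directory. The function name select_files is invented for this example.

```python
import fnmatch

def select_files(names, patterns):
    """Return names matching each shell-style pattern, in pattern order."""
    selected = []
    for pattern in patterns:
        selected.extend(sorted(fnmatch.filter(names, pattern)))
    return selected

# Hypothetical directory listing used only for illustration.
listing = [
    "mitchell_straight_neutral_open_4.pgm",
    "mitchell_straight_happy_sunglasses_4.pgm",
    "mitchell_left_neutral_open_4.pgm",
    "tammo_straight_neutral_sunglasses_4.pgm",
    "tammo_up_sad_open_4.pgm",
]

chosen = select_files(listing, ["*straight*neutral*", "*straight*happy*"])
print(len(chosen))
```

As with the ls commands, the neutral images come first and the happy images are appended after them.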

Create training set
Once we have selected a good set of images, we need to convert them into a format that is appropriate for the neural network. The PGM files have pixel values between 0 and 255; we need to normalize these values between 0 and 1. In addition, we need to create input files where each normalized image is written one per line. I have written some python functions in the file processingFaces.py to help you prepare these image files:
- The function getImages takes a directory name where the images are stored, a filename containing the names of image files in that directory, and a filename to put the normalized image values.
- The function getTargets takes a filename containing the names of image files, and a filename to put the target values for the task. It uses some aspect of the filename to determine an appropriate target value. In the sunglasses example, if it finds the word 'sunglasses' in the filename, then the target value should be 1. Otherwise the target value should be 0.

Execute this file by doing:

python processingFaces.py

For the sunglasses example, this should produce the two training-set files glasses-inputs.dat and glasses-targets.dat.
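
The two conversions described for getImages and getTargets can be sketched as follows. This is a simplified illustration, not the code in processingFaces.py: the function names, the four-decimal formatting, and the space-separated line layout are all assumptions, and reading the PGM files themselves is left out.

```python
def normalize_pixels(pixels, max_value=255):
    """Scale integer pixel values from 0..max_value into 0.0..1.0."""
    return [p / float(max_value) for p in pixels]

def sunglasses_target(filename):
    """Return 1 if the file name contains 'sunglasses', else 0."""
    return 1 if "sunglasses" in filename else 0

def format_pattern(values):
    """Render one pattern as a single space-separated line."""
    return " ".join("%.4f" % v for v in values)

# Tiny made-up "image" of four pixels, just to show the arithmetic.
line = format_pattern(normalize_pixels([0, 51, 102, 255]))
print(line)
print(sunglasses_target("tammo_straight_neutral_sunglasses_4.pgm"))
print(sunglasses_target("tammo_straight_neutral_open_4.pgm"))
```

For a different task, only the target function needs to change, for example by testing for a pose or a userid in the file name instead.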

Set up network
Once the training data has been prepared we can create the neural network. I have provided you with example code in the file glasses-net.py to do this. This creates a new class called SunglassesRecognizer that inherits from the class BackpropNetwork. It adds two additional methods, classify and evaluate. At the bottom of the file, it sets up the network:
- First it creates an instance of the class. Then it creates a three-layer neural network with 960 inputs (to represent the 32 by 30 pixel values in each image), one hidden unit, and one output unit. For harder tasks you will likely need more hidden units. For tasks with more categories, you will also need an output unit for each category.
- Then it sets the learning parameters (epsilon, momentum, and tolerance).
- Next it creates windows to show the activations of all of the layers in the network, as well as windows to display all of the weights from the input to the hidden layer.
- Finally, it randomly splits the data into a training set and a test set. Typically you will want to use about 75-85 percent of the data for training, and the remainder for testing.
Execute this file by doing:
```
python -i glasses-net.py
```
This will produce a number of windows. Move them around so that you can see them all.
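
The random split into training and test sets mentioned in the setup steps can be sketched like this. It is not the code in glasses-net.py; the function name, the seed, and the exact fraction are assumptions chosen so that 79 patterns divide into 60 and 19.

```python
import random

def split_patterns(patterns, train_fraction=0.8, seed=None):
    """Shuffle a copy of patterns and split it into (train, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(patterns)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# 79 stand-in pattern indices, matching the number of images gathered above.
train, test = split_patterns(range(79), train_fraction=0.76, seed=63)
print("%d training patterns, %d test patterns" % (len(train), len(test)))
```

In a real setup each element would be an (input, target) pair, so that the inputs and their targets stay together when shuffled.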

Train network
One of the advantages of using neural networks is their generalization ability. We want to train the network using the training set, but we don't want the network to memorize the training set. We want the trained network to be able to respond appropriately to novel input. We need to be careful not to overtrain the network. Thus the goal is not to achieve 100 percent correctness. We will train the network for a number of epochs and then test the network on novel data to monitor how well it is generalizing. We will repeat this process until the network is no longer improving its performance on the novel data. For example, try the following commands in the python interpreter after you have set up the network:

>>> n.train(5)
Epoch #     1 | TSS Error: 15.3588 | Correct: 0.0000
Epoch #     2 | TSS Error: 15.2270 | Correct: 0.0000
Epoch #     3 | TSS Error: 15.2524 | Correct: 0.0000
Epoch #     4 | TSS Error: 15.1156 | Correct: 0.0000
Epoch #     5 | TSS Error: 14.8360 | Correct: 0.0000
Reset limit reached; ending without reaching goal
----------------------------------------------------
Final #     5 | TSS Error: 14.8360 | Correct: 0.0000
----------------------------------------------------

You can see that error is dropping, but so far the network has not learned to respond correctly to any of the training patterns. NOTE: because each network is initialized with different random weights your training results will vary. Let's check on how well the network is doing on the test patterns.

>>> n.swapData()
Swapping training and testing sets...
19 training patterns, 60 test patterns
>>> n.evaluate()
network classified image #0 (sunglasses) as ???
network classified image #1 (sunglasses) as ???
network classified image #2 (eyes) as ???
network classified image #3 (sunglasses) as ???
network classified image #4 (eyes) as ???
network classified image #5 (eyes) as ???
network classified image #6 (eyes) as ???
network classified image #7 (eyes) as ???
network classified image #8 (eyes) as ???
network classified image #9 (eyes) as ???
network classified image #10 (sunglasses) as ???
network classified image #11 (sunglasses) as ???
network classified image #12 (eyes) as ???
network classified image #13 (eyes) as ???
network classified image #14 (eyes) as ???
network classified image #15 (eyes) as ???
network classified image #16 (eyes) as ???
network classified image #17 (sunglasses) as ???
network classified image #18 (eyes) as ???
19 patterns: 0 correct (0.0%), 19 wrong (100.0%)

As expected, it is not responding correctly to these either. Let's reset the data, continue training, and then re-test.

>>> n.swapData()
Swapping training and testing sets...
60 training patterns, 19 test patterns
>>> n.train(5)
Epoch #     1 | TSS Error: 14.2245 | Correct: 0.0000
Epoch #     2 | TSS Error: 12.9923 | Correct: 0.0000
Epoch #     3 | TSS Error: 10.3263 | Correct: 0.0000
Epoch #     4 | TSS Error: 7.8044 | Correct: 0.0833
Epoch #     5 | TSS Error: 5.8430 | Correct: 0.3500
Reset limit reached; ending without reaching goal
----------------------------------------------------
Final #     5 | TSS Error: 5.8430 | Correct: 0.3500
----------------------------------------------------
>>> n.swapData()
Swapping training and testing sets...
19 training patterns, 60 test patterns
>>> n.evaluate()
network classified image #2 (eyes) as ???
network classified image #4 (eyes) as ???
network classified image #5 (eyes) as ???
network classified image #6 (eyes) as ???
network classified image #7 (eyes) as ???
network classified image #8 (eyes) as ???
network classified image #9 (eyes) as ???
network classified image #10 (sunglasses) as ???
network classified image #12 (eyes) as ???
network classified image #13 (eyes) as ???
network classified image #14 (eyes) as ???
network classified image #15 (eyes) as ???
network classified image #16 (eyes) as ???
network classified image #18 (eyes) as ???
19 patterns: 5 correct (26.3%), 14 wrong (73.7%)

Clearly the performance of the network is improving. After several more iterations of this process of resetting the data and additional training, the network will start performing well on both the training set and the testing set and learning can be stopped. Remember the goal is to achieve good performance on the training set while still being able to respond appropriately to the novel data in the testing set.
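
The train-then-test cycle described above can be written as a loop that stops once performance on the novel data stops improving. The sketch below is generic and does not call conx: train_some and test_accuracy are hypothetical callbacks, the patience value is an assumption, and the accuracy curve is made up to stand in for a real network.

```python
def train_with_early_stopping(train_some, test_accuracy, patience=2, max_rounds=50):
    """Alternate training and testing; stop once test accuracy stalls.

    train_some():    runs a few epochs of training (hypothetical callback).
    test_accuracy(): returns the fraction correct on held-out data.
    patience:        rounds without improvement tolerated before stopping.
    """
    best = -1.0
    rounds_since_best = 0
    history = []
    for _ in range(max_rounds):
        train_some()
        score = test_accuracy()
        history.append(score)
        if score > best:
            best = score
            rounds_since_best = 0
        else:
            rounds_since_best += 1
            if rounds_since_best >= patience:
                break
    return best, history

# Stand-in for a real network: test accuracy follows a made-up curve.
fake_scores = iter([0.0, 0.26, 0.63, 0.84, 0.84, 0.79, 0.74])
best, history = train_with_early_stopping(lambda: None, lambda: next(fake_scores))
print("best %.2f after %d rounds" % (best, len(history)))
```

With the commands shown above, train_some would wrap a call such as n.train(5) and test_accuracy would wrap the swap-and-evaluate step; how to obtain a numeric score from evaluate is left open here.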

Analyze results
Once the training process is complete, we can look more closely at the activations and weights to try to understand how the network has solved the task. In the python interpreter do n.showPerformance() and observe for which images each unit is active. How are the hidden units coding for the different categories of images? For the sunglasses task there is only one hidden unit, so this is pretty straightforward. Next look at the hidden weights. What parts of the image are the hidden units focusing on? Why are these locations of the image important for the given task?

Question 4 (5 points) After you have successfully tried the sunglasses example, repeat this process (Steps 1-6 above) on a task of your choice using some subset of the faces image data. In the file images.txt explain in detail what you did in each step. Also, take a screenshot of your resulting hidden weights and save it in a file called hiddens.png so that you can refer to it in your description. Creating a successful classifier often requires an iterative process. You may find that you need more hidden units. You may find that you need more training examples. Some problems you'd like to try (such as recognizing the emotions in the images) may be quite difficult to solve.

Machine Learning Application

Question 5 (8 points) For this question you can explore something of particular interest to you related to machine learning. Be sure to put all of the relevant files in the directory myML. Explain how to run your code in the README file. Here are some suggestions, but I am open to other options; just OK it with me first:

Apply neural network learning to some other data set. Check out the UCI Machine Learning Repository for some available data sets.
Use an existing library that implements a ML method:
- Apply it to the same problem you did in question 4 and compare the results.
- Apply it to some other data set.
Implement a ML method:
- Decision trees
- Genetic algorithm

Remember to run handin63 by 11:59pm on Tuesday, Dec. 3. Also be sure to select your partner.