Lab 6: Convolutional Neural Networks

Due October 29 by midnight

Starting point code

Use Teammaker to form your team. You can log in to that site to indicate your partner preference. Once you and your partner have specified each other, a GitHub repository will be created for your team.

Introduction

In this lab you will be using Keras to explore convolutional neural networks (note that this is not a statement about the relative value of Keras and PyTorch; I just didn't want to make you learn both of them at once, so I picked one). You will start by experimenting with an already constructed ConvNet to learn the MNIST dataset. Once you understand how to use Keras to construct, train, and test a ConvNet, you will create your own ConvNet to classify images of dogs and cats.

Relative to all of the previous labs in this class, much of the design of the lab is left up to you. This is intended to start to prepare you for your upcoming AI project. You will likely be using some type of publicly available software, like Keras, to complete your project. You should get used to consulting the documentation when you have questions. I am also willing to help, of course.

Also note that this lab requires a more involved writeup than the previous ones did; again, this is a step towards preparing for your final project. When planning your work, expect to budget significant time for the writeup.

Using Keras

Before using Keras, you need to activate a virtual environment that contains all of the software packages that you'll need:

source /usr/swat/bin/CS63-10.1

To leave the virtual environment, just type:

deactivate

Deep learning software works best on machines with GPUs that have lots of available memory. Here is a summary of the CS machines with the best GPUs. When doing longer experiments you should ssh into one of these machines like this:

ssh -X username@machinename.cs.swarthmore.edu

The -X flag enables X forwarding so that any windows that pop up will show up on the machine you sshed from. Note that this will only work properly if you have an X server running on your local machine (this is true by default if you're running Linux, but will take some extra work under other operating systems).

Another useful command is nvidia-smi, which will give you info on the status of the GPU in the machine you're logged into. If you want to watch the status change, try

nvidia-smi -l 1

which will run the command in a loop once every second (you'll need to use Ctrl-c to end the command when you're done). You may want to have two terminals logged into the same machine, and run nvidia-smi in one while running your program in the other.
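
From inside Python, one way to confirm that your program can actually see the GPU is to ask the deep learning backend for its device list. The sketch below assumes the virtual environment provides TensorFlow 2.x as the backend for Keras, which this lab does not state explicitly; visible_gpus is a hypothetical helper name, and it returns an empty list when TensorFlow cannot be imported.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see (empty list if none)."""
    try:
        import tensorflow as tf  # assumption: TensorFlow 2.x is installed
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


print("GPUs visible to TensorFlow:", visible_gpus())
```

If the list is empty on a machine where nvidia-smi reports a GPU, the virtual environment is probably not active or the GPU is not available to your process.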

MNIST: Classifying handwritten digits

Open up the file mnist_net.py and read through the code. Using a Keras network will typically require the steps listed below. Try to identify how each of these steps is being handled for learning the MNIST dataset.

1. Prepare the data: This may include loading, reshaping, normalizing, and dividing the data into separate training and testing sets.
2. Construct the model: We will be using Sequential() models. Add the layers in order from input to output.
3. Compile the model: Specify the optimizer, loss, and metric that will be used.
4. Train the model: Determine the number of epochs to complete.
5. Test the model: After training is complete, analyze the results.
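
The five steps above can be sketched roughly as follows. This is not the code in mnist_net.py; it is a minimal illustration that assumes Keras is available through TensorFlow 2.x (tensorflow.keras) and that the data comes from keras.datasets.mnist. The function names, layer sizes, optimizer, batch size, and number of epochs are arbitrary placeholders, not recommendations.

```python
import numpy as np


def prepare(images, labels, num_classes=10):
    """Step 1: scale pixels to [0, 1], add a channel axis, one-hot the labels."""
    x = images.astype("float32") / 255.0
    x = x.reshape(x.shape + (1,))  # (N, 28, 28) -> (N, 28, 28, 1)
    y = np.eye(num_classes, dtype="float32")[labels]
    return x, y


def build_model(input_shape=(28, 28, 1), num_classes=10):
    """Steps 2 and 3: construct a small Sequential ConvNet and compile it."""
    from tensorflow import keras  # assumption: TensorFlow 2.x provides Keras

    model = keras.Sequential()
    model.add(keras.Input(shape=input_shape))
    model.add(keras.layers.Conv2D(16, (3, 3), activation="relu"))
    model.add(keras.layers.MaxPooling2D((2, 2)))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(num_classes, activation="softmax"))
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def run():
    """Steps 4 and 5: train for a few epochs, then evaluate on held-out data."""
    from tensorflow import keras

    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    x_train, y_train = prepare(x_train, y_train)
    x_test, y_test = prepare(x_test, y_test)
    model = build_model()
    model.fit(x_train, y_train, epochs=3, batch_size=128, validation_split=0.1)
    loss, accuracy = model.evaluate(x_test, y_test)
    print("test loss:", loss, "test accuracy:", accuracy)
```

Calling run() executes all five steps in order. Keeping data preparation and model construction in separate functions makes it easier to change one step at a time while you experiment.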

Typically the network will not perform as well as you expected on the first try. To improve the performance you will need to iterate over these steps multiple times. In some cases, you may need to pre-process the data in some way to simplify the problem. Or you may need to tweak the model architecture by adding more layers or changing the parameters within existing layers. If your inputs are in the range [-1, 1] rather than [0, 1], you may need to use a different activation function, like tanh. You might get better results with a different optimizer or loss function. If loss is still dropping, you may just need to train the model for more epochs. Be sure to look at the Keras docs to learn what the options are (e.g. what optimizers are available and how they differ).
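
One of the checks mentioned above, whether loss is still dropping, can be made concrete by comparing the most recent epochs with the ones just before them. In Keras, fit returns a History object whose history dictionary holds per-epoch lists such as "loss". The helper below is a sketch: the name still_improving, the window size, and the threshold are all hypothetical choices.

```python
def still_improving(losses, window=3, min_delta=1e-3):
    """Return True if the mean loss over the last `window` epochs is lower
    than the mean over the preceding `window` epochs by more than `min_delta`."""
    if len(losses) < 2 * window:
        return True  # too little history to conclude that training has stalled
    recent = sum(losses[-window:]) / window
    previous = sum(losses[-2 * window:-window]) / window
    return previous - recent > min_delta


# With Keras, the list would typically come from: model.fit(...).history["loss"]
example_losses = [0.90, 0.60, 0.45, 0.40, 0.38, 0.37]
print(still_improving(example_losses))
```

Averaging over a window rather than comparing two single epochs makes the check less sensitive to epoch-to-epoch noise.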

Test out the effects of changing each hyper-parameter separately first to get a feel for what each one does, and then try combining them in different ways. Try to make your exploration systematic, rather than just trying random values, and be sure to take notes of your experiments for your writeup.
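
One way to keep the exploration systematic is to generate the list of runs up front: first vary one hyper-parameter at a time around a baseline, then combine values in a grid. The sketch below only builds the lists of configurations; the hyper-parameter names and values are placeholders, and training a network for each configuration is left to your own code.

```python
import itertools

# Hypothetical baseline and alternative values; adjust to your own network.
baseline = {"learning_rate": 0.001, "num_filters": 16, "epochs": 5}
alternatives = {
    "learning_rate": [0.01, 0.0001],
    "num_filters": [32, 64],
    "epochs": [10],
}


def one_at_a_time(baseline, alternatives):
    """Vary a single hyper-parameter per run, holding the rest at baseline."""
    runs = [dict(baseline)]
    for name, values in alternatives.items():
        for value in values:
            runs.append({**baseline, name: value})
    return runs


def full_grid(baseline, alternatives):
    """Every combination of baseline and alternative values."""
    names = list(baseline)
    choices = [[baseline[n]] + list(alternatives.get(n, [])) for n in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*choices)]


single = one_at_a_time(baseline, alternatives)
grid = full_grid(baseline, alternatives)
print(len(single), "one-at-a-time runs;", len(grid), "runs in the full grid")
```

The one-at-a-time list grows with the sum of the number of alternatives, while the grid grows with their product, so the grid is usually only practical for the few values that looked promising individually.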

Try to improve the performance on the MNIST data set as much as you can before you move on to the more challenging problem below.

Classifying Images of Dogs and Cats

Next we will explore a dataset from a Kaggle competition called Dogs vs Cats. Here is a description of the dataset (excerpted from Kaggle):

Web services are often protected with a challenge that's supposed to be easy for people to solve, but difficult for computers. Such a challenge is often called a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) or HIP (Human Interactive Proof). HIPs are used for many purposes, such as to reduce email and blog spam and prevent brute-force attacks on web site passwords.

Asirra (Animal Species Image Recognition for Restricting Access) is a HIP that works by asking users to identify photographs of cats and dogs. This task is difficult for computers, but studies have shown that people can accomplish it quickly and accurately.

Asirra is unique because of its partnership with Petfinder.com, the world's largest site devoted to finding homes for homeless pets. They've provided Microsoft Research with over three million images of cats and dogs, manually classified by people at thousands of animal shelters across the United States.

The competition website notes that machine classifiers can now score above 80 percent accuracy on this task, so it is no longer considered a safe form of CAPTCHA. You will explore how well you can do using a convolutional deep neural network to solve this task.

Open up the file dogscats_net.py and read through the code that has been provided. I have created a directory containing nearly 25,000 images here:

    /home/mitchell/public/cs63/dogs-vs-cats/

Do not copy these images to your own directory! You can load the images directly from this directory. If you ever do want to store a large dataset, you should save it to /scratch/yourusername/. Files stored there will not count against your quota.

The images have been re-sized and padded so that they all have a uniform shape of (128, 128, 3). I have provided a read_data function that will read in a specified number of these images, and return a list of input vectors and a list of target vectors that you can use to build your training and testing datasets.
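
The inputs and targets returned by read_data still need to be divided into training and testing sets. The sketch below shows one way to do that, assuming the two lists have equal length and can be converted to NumPy arrays. It does not assume anything about the signature of read_data, so placeholder arrays stand in for the real data; split_data, test_fraction, and seed are hypothetical names.

```python
import numpy as np


def split_data(inputs, targets, test_fraction=0.2, seed=0):
    """Shuffle inputs and targets together, then split off a test set."""
    inputs = np.asarray(inputs)
    targets = np.asarray(targets)
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must have the same length")
    order = np.random.default_rng(seed).permutation(len(inputs))
    num_test = int(len(inputs) * test_fraction)
    test_idx, train_idx = order[:num_test], order[num_test:]
    return (inputs[train_idx], targets[train_idx]), (inputs[test_idx], targets[test_idx])


# Placeholder data with the image shape described above; in the lab these
# arrays would come from the provided read_data function instead.
fake_images = np.zeros((10, 128, 128, 3), dtype="float32")
fake_targets = np.arange(10)
(train_x, train_y), (test_x, test_y) = split_data(fake_images, fake_targets)
print(train_x.shape, test_x.shape)
```

Shuffling before splitting matters if the images happen to be read in a fixed order, since an unshuffled split could then put mostly one class in the test set. Fixing the seed keeps the split the same across experiments, so results remain comparable.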

It is your job to come up with a network that does as good a job as you can on this dataset. In addition to the hyperparameters you experimented with for the MNIST data set, you may also want to consider using data augmentation (see here for a simple intro on doing this in TensorFlow, though keep in mind you'll need to adapt it for your data and network), or maybe even transfer learning (again, the TensorFlow docs have a guide, though as always it's up to you to adapt those examples to your own data). Even though the transfer learning example uses a dataset similar to ours, do not copy the code that downloads their unpreprocessed version of the dataset into your home directory.
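
To illustrate the idea behind data augmentation, the sketch below randomly mirrors and shifts each image in a batch using plain NumPy. This is a deliberately simple stand-in, not the approach from the linked TensorFlow guide (recent Keras versions also offer preprocessing layers such as RandomFlip); augment_batch and max_shift are hypothetical names, and the wrap-around shift is a simplification.

```python
import numpy as np


def augment_batch(images, rng, max_shift=8):
    """Return a randomly flipped and horizontally shifted copy of a batch.

    images: array of shape (N, height, width, channels).
    """
    augmented = images.copy()
    for i in range(len(augmented)):
        if rng.random() < 0.5:
            augmented[i] = images[i, :, ::-1, :]  # mirror left-right
        shift = int(rng.integers(-max_shift, max_shift + 1))
        augmented[i] = np.roll(augmented[i], shift, axis=1)  # wrap-around shift
    return augmented


rng = np.random.default_rng(0)
batch = rng.random((4, 128, 128, 3)).astype("float32")
augmented = augment_batch(batch, rng)
print(augmented.shape)
```

The targets do not change under these transformations, since a mirrored or shifted cat is still a cat. Augmentation should be applied to the training set only, so that test accuracy is measured on unmodified images.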

Writeup

In the file writeup.tex, you should describe the neural network you built and the experiments you ran. As you work, you should keep track of what network architectures and parameters you have tried and how well they classified the images.

You can edit the writeup.tex file with any text editor. To compile your latex document into a pdf, run the following command:

pdflatex writeup.tex

You can then open the pdf from the command line with the command evince. See last week's lab for more resources on LaTeX, or just google "latex [topic]" to find lots of tutorials. Feel free to use services like sharelatex to edit your LaTeX file.

Submitting

You should use git to add, commit, and push any files that you modify.