AI Final Project
Due May 5 (Sunday) by midnight

Starting point code

Use Teammaker to form your team. You can log in to that site to indicate your partner preference. Once you and your partner have specified each other, a GitHub repository will be created for your team.

Introduction

The goal of the project is to give you several weeks to explore an AI topic of your choice in more depth. If you are a CS major, this project may serve as the basis for satisfying your senior comprehensive. In the next section I have offered some suggestions, but feel free to propose other ideas (just be sure to discuss them with me first).

Project Ideas

Here are some potential project ideas:

Machine Learning Contests

In recent years, a large amount of work in machine learning has been motivated by various contests and challenges. One of the earliest and best known was the Netflix prize (official site, Wikipedia), which offered $1M to the team that could improve the site's recommendation system by 10%. The Netflix prize was claimed in 2009; since then machine learning contests have become commonplace.

Find a machine learning challenge of your choice from Kaggle. Some of these contests are currently active, with prizes available. Others are inactive, but are still interesting challenges to attempt for a project.

Kaggle competitions vary widely in what sort of data and instructions are provided. You should therefore think carefully about the competition you choose: not just "is it a cool problem?" but also "how hard will this data be to work with?" and "how clearly are the expectations of the competition defined?". The following contests are feasible options that would likely be good candidates for convolutional deep learning networks, but you are welcome to explore others as well. Please check with me and describe your plan of attack before you get too involved in a particular contest.

In order to download data, you will need to sign up for a free account. Kaggle also has a discussion forum, which may have useful suggestions, especially if you are working on an active contest.
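One quick way to answer "how hard will this data be to work with?" is to count rows and missing values as soon as you have downloaded a file. The following is a minimal sketch, assuming the contest provides its data as a CSV file with a header row; the function name summarize_csv and the file name train.csv are hypothetical, and many contests ship images or other formats that need different handling.

```python
import csv
import io


def summarize_csv(handle):
    """Return (row_count, {column: missing_count}) for an open CSV file."""
    reader = csv.DictReader(handle)
    # One counter per column named in the header row.
    missing = {name: 0 for name in (reader.fieldnames or [])}
    rows = 0
    for row in reader:
        rows += 1
        for name in missing:
            value = row.get(name)
            # Treat absent fields and blank strings as missing.
            if value is None or value.strip() == "":
                missing[name] += 1
    return rows, missing


if __name__ == "__main__":
    # Tiny in-memory stand-in for a downloaded file such as train.csv
    # (hypothetical name); replace with open("train.csv", newline="").
    sample = io.StringIO("id,label,pixel0\n1,cat,0.5\n2,,0.7\n3,dog,\n")
    rows, missing = summarize_csv(sample)
    print("rows:", rows)
    for name, count in missing.items():
        print(f"{name}: {count} missing")
```

A column that is mostly empty, or a file with far fewer rows than you expected, is worth knowing about before you commit to a contest.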

You are encouraged to make use of existing machine learning libraries such as scikit-learn and Keras. Keras is an excellent resource for neural networks; scikit-learn should be your go-to library for all other machine learning algorithms.

Project Etiquette

Some of the project ideas will likely involve large data sets that could quickly blow through your disk quota. To avoid this, you can save them to /scratch (instructions), which is unlimited, but isn't backed up.

Also, take a look at the department's suggestions for long-running jobs.
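To judge whether a data set is large enough to threaten your disk quota, it can help to measure it before deciding where to keep it. The following is a minimal sketch using only the Python standard library; the helper names dir_size_bytes and format_size are hypothetical, and the demo measures a temporary directory rather than any real path under /scratch, since the exact layout there is described in the linked instructions.

```python
import os
import tempfile


def dir_size_bytes(root):
    """Total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so linked files are not counted.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def format_size(num_bytes):
    """Render a byte count with a binary unit suffix."""
    for unit in ("B", "KiB", "MiB", "GiB"):
        if num_bytes < 1024:
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1024
    return f"{num_bytes:.1f} TiB"


if __name__ == "__main__":
    # Demo on a throwaway directory; point dir_size_bytes at your own
    # data directory instead.
    with tempfile.TemporaryDirectory() as demo_dir:
        with open(os.path.join(demo_dir, "sample.bin"), "wb") as f:
            f.write(b"\0" * 4096)
        print(format_size(dir_size_bytes(demo_dir)))
```

If the total is a meaningful fraction of your quota, that is a sign the data belongs in /scratch, keeping in mind that nothing stored there is backed up.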

Submitting

Before the deadline, you need to submit the following things through git:

In addition, you must turn in a hard copy of the writeup PDF outside my office.

In the LaTeX file, project.tex, you will describe your project. This file already contains a basic structure that you should follow. Feel free to change the section headings or insert additional sections. Recall that you use pdflatex to convert the LaTeX into a PDF file.

As your project develops and you create more files, be sure to use git to add, commit, and push them. Run git status to check that all of the necessary files are being tracked in your git repo. Don't forget to update the README so that I can test your code!