AI Final Project
Extension: Due May 9th by 8am


Starting point code

Use Teammaker to form your team. You can log in to this site to indicate your partner preference. Once you and your partner have specified each other, a GitHub repository will be created for your team.

Introduction

The goal of the project is to give you several weeks to explore an AI topic of your choice in more depth. In the next section are some suggestions, but feel free to consider other ideas (just be sure to discuss them with me first).

Please keep in mind the amount of time you have to do this project; you should plan to do about the same amount of work on the project each week that you would on a normal lab. In other words, don't pick a project you can finish in two days, but don't pick one that would take two months either.

Project Ideas

Here are some potential project ideas:

Or come up with your own idea (again, be sure to run it by me before you get started). All the algorithms we've used this semester have a multitude of variants you can explore, and many of them can be combined with each other in interesting ways (e.g. combine a CNN with RL to play an Atari game, use a GA to evolve a board evaluation heuristic for MiniMax, etc.).

You are encouraged to make use of existing libraries (e.g. Keras, scikit-learn, etc.), as well as other resources you may find on the web. However, keep in mind the standard ethics policy: outside resources are fine, so long as you attribute them properly and make clear what work you personally did.

Machine Learning Contests

In recent years, a large amount of work in machine learning has been motivated by various contests and challenges. One of the earliest and best known was the Netflix prize (official site, Wikipedia), which offered $1M to the team that could improve the accuracy of the site's recommendation system by 10%. The Netflix prize was claimed in 2009; since then, machine learning contests have become commonplace.

Find a machine learning challenge of your choice from Kaggle. Some of these contests are currently active, with prizes available. Others are inactive, but are still interesting challenges to attempt for a project.

Kaggle competitions vary widely in what sort of data and instructions are provided. You should therefore think carefully about the competition you choose: not just "is it a cool problem?" but also "how hard will this data be to work with?" and "how clearly are the expectations of the competition defined?". Please check with me and describe your plan of attack before you get too involved in a particular contest.

In order to download data, you will need to sign up for a free account. Kaggle also has a discussion forum, which may have useful suggestions, especially if you are working on an active contest.

Project Etiquette

Some of the project ideas will likely involve large data sets that could quickly blow through your disk quota. To avoid this, you can save them to /scratch (instructions), which is unlimited, but isn't backed up.

As a general rule, /scratch is a good place for things that are large but can be re-created if they're lost (e.g. data files or large program output). You should still keep your source code in your home directory (and in git). Definitely don't add giant data files to your git repo, though.
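One way to catch oversized files before they end up in your repo is a small check like the sketch below. It is only an illustration: the function name find_large_files and the 10 MB default threshold are assumptions made for this example, not course requirements, so adjust them to suit your project.

```python
import os


def find_large_files(root, max_bytes=10 * 1024 * 1024):
    """Return (path, size) pairs for files under root larger than max_bytes.

    The 10 MB default is an arbitrary threshold chosen for illustration.
    Results are sorted from largest to smallest.
    """
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git's own bookkeeping directory; it is not part of your work tree.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # Ignore symbolic links (e.g. a link pointing into /scratch).
                continue
            size = os.path.getsize(path)
            if size > max_bytes:
                large.append((path, size))
    return sorted(large, key=lambda item: item[1], reverse=True)
```

Calling find_large_files(".") from the top of your repo lists candidates to move to /scratch rather than add to git.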

Also, take a look at the department's suggestions for long-running jobs. As that page suggests, the screen program is very helpful, but remember that your screen sessions will last until you manually end them, so try to avoid leaving dozens of abandoned instances of screen on a server.

Project Checkpoint

By the end of the first week, you need to have turned in the project checkpoint in the file checkpoint.tex. Details about what to include in your checkpoint are provided in the file. We will then take time during lab that day (4/29) for each group to describe their project idea to the class in 3-5 minutes.

Project Report

In the LaTeX file, project.tex, you will describe your project. This file already contains a basic structure that you should follow. Feel free to change the section headings, or insert additional sections. Recall that you use the command pdflatex to convert the LaTeX into a PDF file.

Your project repo contains a file called rubric.md. This gives a detailed account of how your project will be graded.

Submitting

Before the deadline, you need to submit the following things through git:

As your project develops and you create more files, be sure to use git to add, commit, and push them. Run git status to check that all of the necessary files are being tracked in your git repo.

In addition, you must turn in a printed copy of your project writeup outside my office by the deadline.