CS81 Lab 4: Novelty Search

Due by noon next Friday

In this lab you will implement novelty search, introduced by Joel Lehman and Kenneth O. Stanley, who argue that objective fitness functions can be deceptive, leading an evolutionary search to local maxima rather than towards the desired goal. In contrast, novelty search ignores the objective and simply searches for novel behavior, and therefore cannot be deceived.

In this lab we will focus on novelty search in isolation, without using it as part of an evolutionary search. We will explore how it operates on randomly generated behaviors. In the next lab we will apply novelty search to a robot control problem.

Git started

Go through the following steps to setup your directory for this lab.

  1. First you need to run setup81 to create a git repository for the lab.

    If you want to work alone do:

    setup81 labs/04 none
    If you want to work with a partner, then one of you needs to run the following while the other one waits until it finishes.
    setup81 labs/04 partnerUsername
    Once the script finishes, the other partner should run it on their account.

  2. For the next step only one partner should copy over the starting point code.
    cd ~/cs81/labs/04
    cp -r ~meeden/public/cs81/labs/04/* ./
    

  3. Whether you are working alone or with a partner, you should now add all of the files to your git repo, commit, and push them as shown below.
    git add *
    git commit -m "lab4 start"
    git push
    

  4. If you are working with a partner, your partner can now pull the changes in.
    cd ~/cs81/labs/04
    git pull
    

Implementing novelty search

Before trying to implement novelty search, review the details of the algorithm starting at the bottom of page 12 in the paper Abandoning objectives: Evolution through the search for novelty alone that we discussed this week.

Recall that novelty search maintains an archive of behaviors, comparing current behaviors against the saved ones to determine whether a current behavior is novel enough to add to the archive. A current behavior's novelty is defined as the average distance to its k nearest neighbors in the archive.

We will be representing a behavior as a list of points, where the points may be of some arbitrary dimension, and the list may be of some arbitrary length. For example, consider the coverage task that we worked on in the previous lab. One way to represent the behavior of an agent in this task is to record the new grid locations it visits in order. The length of this behavior space would be the total number of grid locations, which for a 15x15 grid is 225. The length of each point would be 2, representing the x, y location of the grid square.

In the file noveltySearch.py, implement a python class called NoveltySearch with the following methods:

Each method of your class should include a triple-quoted comment indented underneath the def line which provides details about its parameters, return value (if applicable), and its purpose.

You've implemented the basic methods needed to do novelty search. Use git to add, commit, and push your changes.

Unit testing individual methods

In the function unitTests write code that allows you to individually test each of the methods in the following order:

  1. distance
  2. distFromkNearest
  3. sparseness
  4. checkArchive

Create test cases for which you determine the correct result and then confirm that your code responds appropriately. Be sure to test different lengths of behaviors and different lengths of points.

You will be evaluated on the thoroughness of your unit tests.

Use git to add, commit, and push your changes.

Adding analysis tools to novelty search

Analyzing the outcome of a machine learning process can be tricky because it may incorporate randomness that leads to different results for each run. It is often the case that you will need to implement additional functionality to assist you in analyzing the results. In this case we are going to add a method to save the archive, allowing us to plot the novel behaviors found, as well as a method to save a history of the archive's growth over time, allowing us to determine good settings for the novelty threshold.

Add more unit tests to your main program for these two new methods.

Use git to add, commit, and push your changes.

Experiment with novelty search parameters

In order to gain more intuition about how novelty search operates, you will explore how various parameter settings affect the contents of the archive. Let's focus on the threshold parameter since it is crucial in determining when a behavior is novel enough to be added to the archive.

In your main program, instantiate a novelty search with a k of 5, a limit of 30, a pointLength of 2, and a behaviorLength of 3. Now we will try to determine a good threshold value. Let's assume that our point values will be random numbers in the range [0, 1].

Given these parameter settings what is the maximum sparseness value possible? Figure this out and write it as a comment in your main program.

Based on this maximum value, pick a threshold in the range [0, max]. Loop 1000 times, generating a random behavior and calling checkArchive on that behavior. After the loop, save the archive and the growth. Run the program and check the results. Smaller thresholds should result in more additions, with archive hitting its limit in size more quickly. For larger thresholds the archive may never reach its size limit. What threshold value seems to strike the best balance of using the capacity of the archive at a reasonable pace? Again, write your answer as a comment.

Submit

When you are completely done, be sure that you have pushed all of your changes. Use the git status command to verify this. If there are uncommitted changes, be sure to do another add, commit, and push cycle.