- Understand how novelty search operates at a detailed enough level to successfully implement it.
- Develop unit tests to ensure that your implementation operates correctly.
- Develop analysis tools to monitor search results.
- Use randomness to generate larger-scale tests, and experiment with search parameters.

Use the github server to get the starting point notebook for this lab.

Novelty search is a method of searching for a variety of solutions rather than trying to optimize a particular goal. To do this, novelty search maintains an *archive* of behaviors, comparing current behaviors against the saved ones to determine whether a current behavior is novel enough to add to the archive. A current behavior's novelty is defined as the **average distance** to its **k** nearest neighbors in the archive.

We will be representing a *behavior* as a **list of points**, where points may be of some arbitrary dimension, and the list may be of some arbitrary length. For example, consider the coverage task that we worked on in the previous lab. One way to represent the behavior of a robot in this task is to record the **new** grid locations it visits in order. The length of this particular behavior space will be the total number of grid locations, which for a 15x15 grid is 225. The length of each point in this behavior space will be 2, representing the (x, y) location of each grid square.

**constructor**: Create class variables to store the**archive**as an initially empty list, the**k**(an integer) to be used in calculating nearest neighbors, the**limit**(an integer) for the maximum size of the archive, the**threshold**(a float) for how novel an example has to be before it will be added the archive, and the**max_dist**(a number) for the maximum possible distance between any two behaviors in the archive.**point_dist**: Returns the Euclidean distance between points**p1**and**p2**, which are assumed to be lists of equal length. To calculate distance, sum up the squared differences between each component of the points. Return the square root of this sum.*This method should work for points of any length.***behavior_dist**: Returns the sum of the point distances between behaviors**b1**and**b2**, which are assumed to be lists of points of equal length.*This method should work for behaviors of any length.***k_nearest_dist**: For the given**behavior**, returns the sum of the behavior distances to its**k**nearest neighbors in the archive.**sparseness**: For the given**behavior**, returns the**normalized**average distance (based on the**max_dist**) to its**k**nearest neighbors in the archive. If the archive is empty, then return 1.0. If the archive has less than**k**behaviors, the use the archive's current size to compute the average distance. The sparseness of a behavior will represent its novelty.**check_archive**: This method checks whether the given**behavior**is novel enough to add to the archive and returns its**sparseness**. The parameter**other_info**can be used to store additional information along with the behavior in the archive. This will be helpful when doing analyses of the evolutionary run, once it completes. The archive is represented as a list of tuples, where each tuple contains a behavior, its sparseness, and the its other associated information. The behavior at the front of the archive is the oldest, and the behavior at the end of the archive is the newest. When the size of the archive is less than**k**, then add the given behavior to the archive. Otherwise check if the**sparseness**of the given behavior is greater than the**threshold**, and if so, add the behavior to the archive. After adding a behavior, test if the archive size has exceeded the**limit**, and if so, remove the oldest behavior from the archive.**plot_behaviors**and**plot_archive**: You will implement these methods later.

Once you complete your first pass at the implementation, use git to add, commit, and push your changes.

In [1]:

```
import random
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
```

In [2]:

```
class NoveltySearch(object):
def __init__(self, k, limit, threshold, max_dist):
pass
def point_dist(self, p1, p2):
pass
def behavior_dist(self, b1, b2):
pass
def k_nearest_dist(self, behavior):
pass
def sparseness(self, behavior):
pass
def check_archive(self, behavior, other_info=None):
pass
def plot_growth(self):
pass
def plot_behaviors(self, max_x=1.0, max_y=1.0):
pass
```

Test your implementation by creating some sample behaviors where all points are two dimensions and have coordinates in the range **[0, 1]**, and all behaviors contain four points. Determine the maximum possible distance between such points. You'll also need to determine a good threshold based on your test behaviors.

- Create an instance of the
**NoveltySearch**class where**k**is 2, the**limit**is 3, and the**threshold**and**max_dist**are set to appropriate values. - Create at least 5 different behaviors. Remember that a behavior is defined as a list of points. Create two pairs of behaviors that are similar to one another (but not to the other pair) and one behavior that is unique from all of the rest.
- Use
**assert**to verify that all of your methods work as you expect them to. - Demonstrate that after calling
**check_archive**on all of the sample behaviors, the archive retains one example from each pair along with the most unique pattern.

In [3]:

```
def unit_tests():
pass
unit_tests()
```

Ensuring that the outcome of a machine learning process is correct can be tricky because it may incorporate randomness that leads to different results for each run. It is often the case that you will need to implement additional functionality to assist in analyzing the results. We will use **matplotlib**'s **pyplot** to create several figures to help us monitor the search.

**plot_growth**: Plots the growth of the archive over time. In order to accomplish this you'll need to add a few more class variables. In the constructor, initialize the class variables**self.steps**to zero and**self.growth**to the empty list. Then in the**check_archive**method, increment**self.steps**every time this method is called. Also, whenever an addition is made to the archive, append to the**self.growth**list a tuple of the current step and the current size of the archive.**plot_behaviors**: Plots all of the behaviors that are currently in the archive. It should first assert that the archive is not empty and that the points within the behaviors have just two dimensions. The**max_y**and**max_y**parameters define the maximum possible x and y coordinates within the behavior space.

Once you complete the analysis tools, use git to add, commit, and push your notebook.

In order to gain more intuition about how novelty search operates, you will explore how various parameters settings affect the contents of the archive. Let's focus on the **threshold** parameter since it is crucial in determining when a behavior is novel enough to be added to the archive.

Complete the function **random_2d_behavior** that returns a behavior of the given **length** as a list of 2d points in the range **[0,1]**.

Next, complete the function **randomized_tests**. Create an instance of **NoveltySearch** with a **k** of 3 and a **limit** of 10. Let's use behaviors of length 3. You'll need to determine the **max_dist** for these behaviors, and you will experiment with the threshold setting. Loop 500 times, generating a random behavior and calling the **check_archive** method. After the loop, call the **plot_growth** and **plot_behaviors** methods to see the results.

Recall that the sparseness values are normalized, so the threshold setting should be a value in the range **[0,1]**. Smaller thresholds should lead to the archive reaching the limit in size more quickly. For high enough thresholds, the archive may never reach its size limit. What threshold value(s) seem to strike the best balance of using the capacity of the archive at a reasonable pace? You should run multiple experiments at each threshold setting. Include a markdown table showing your results.

In [4]:

```
def random_2d_behavior(length):
pass
def randomized_tests():
pass
randomized_tests()
```

Be sure to save this notebook to the repository.