CS81 Lab5: Unsupervised Categorization

Due by noon next Friday
Introduction

In this lab you will observe a robot as it moves through a two-room environment. On each time step it will check its sonar sensors and create a vector representing the minimum sonar values on its left, front, right, and back. It will pass this vector of sensor values to an unsupervised categorization method. You will record the kinds of categories that are formed. You will experiment both with the categorization method's parameters and with the robot's mode of exploration to see how they affect the number and type of categories that are discovered.


Resource Allocating Vector Quantizer (RAVQ)

You will use a RAVQ, which was developed by Fredrik Linaker and Lars Niklasson, as the unsupervised categorization method. Given a time series of vectors, the RAVQ generates a set of model vectors that represent consistent categories of vectors within that time series. Resource allocating means that the number of categories is not fixed, but is dynamically determined as the unsupervised learning proceeds.

The RAVQ consists of two main components:

  1. Input buffer
  2. Set of model vectors
When the RAVQ begins, the buffer must be initialized by filling it with the first n inputs, where n is the buffer size. Then a moving average can be calculated. After the buffer is full, at each step the current input is added to the buffer and the oldest input in the buffer is deleted, maintaining the size of the buffer at n. A moving average vector is calculated by averaging all the inputs currently in the buffer. Then the RAVQ determines whether the current moving average is a member of an existing category or if it qualifies as a new model vector. To do so the moving average must meet two criteria. It must be a good representation of the inputs and it must be unique enough when compared to the existing set of model vectors.

There are three key RAVQ parameters that must be set before learning begins. Each of these parameters will affect the number of categories that will be created.

  1. Buffer size
    This value determines the number of input vectors that are stored in the buffer. The size should reflect the probable rate of change within the environment. A small buffer size may lead to spurious categories that are based on noise. A large buffer size will cause the moving average to be quite stable and may lead to very few categories.
  2. Epsilon: Stability criterion
    This value determines how close the moving average must be to the input buffer vectors to qualify as a possible new model vector. A small epsilon means that the vectors that make up the moving average must be nearly identical in order to be considered as a potential category. A large epsilon means that the vectors that make up the moving average could be quite dissimilar and still be considered as a potential category.
  3. Delta: Mismatch reduction requirement
    This value determines how different a moving average vector must be from all existing model vectors to justify creating a new model vector. A small delta means that the moving average need not be very different from existing model vectors in order to create a new category. A large delta means that the moving average must be significantly different from the existing model vectors to create a new category.

If you'd like to see the implementation of the RAVQ go to:

/usr/local/pyrobot/brain/ravq.py


Getting started

1. Categorizing while wall following

2. Categorizing while wandering

Submit

When you are done run handin81 to turn in your completed lab work.