CS81 Lab4: Intelligent Adaptive Curiosity

You do not need to turn anything in for this lab.

To begin, run update81 to copy the starting point files into your home directory (cs81/labs/4/).

Introduction

Intelligent Adaptive Curiosity (IAC) was developed by Pierre-Yves Oudeyer, Frederic Kaplan, and Verena Hafner with the goal of providing a robot with an intrinsic motivation system that pushes it to focus on situations that maximize its learning progress. IAC contains a memory that is sub-divided into sensorimotor regions. Each region contains:

In the original description, IAC's memory is organized hierarchically as a tree. Initially, all exemplars are members of a single region. When a region grows too large, it is split into two new regions. The split is made on a feature of the SM(t) component of the exemplars, chosen so that the sum of the variances of the S(t+1) components in the two new regions, each weighted by the number of exemplars in that region, is minimized. This process continues recursively on any region that grows too large.
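The split criterion can be sketched in a few lines of Python. This is only an illustration, not the code in iac.py: the function names, the representation of an exemplar as an (sm, s_next) pair of tuples, and the choice of candidate cut points (midpoints between consecutive distinct feature values) are all assumptions made for this example.

```python
def variance(values):
    """Population variance of a sequence of numbers."""
    mean = sum(values) / float(len(values))
    return sum((v - mean) ** 2 for v in values) / float(len(values))


def weighted_variance(group):
    """Sum of the per-dimension variances of S(t+1), times the group size."""
    if not group:
        return 0.0
    dims = zip(*(s_next for _, s_next in group))
    return len(group) * sum(variance(d) for d in dims)


def best_split(exemplars):
    """Search every SM(t) feature and cut point for the lowest-cost split.

    exemplars: list of (sm, s_next) pairs, each a tuple of floats
    (a hypothetical representation chosen for this sketch).
    Returns (feature_index, cut, cost), or None if no split is possible.
    """
    best = None
    n_features = len(exemplars[0][0])
    for j in range(n_features):
        values = sorted(set(sm[j] for sm, _ in exemplars))
        # Candidate cuts: midpoints between consecutive distinct values,
        # so both sides of every candidate split are non-empty.
        for lo, hi in zip(values, values[1:]):
            cut = (lo + hi) / 2.0
            left = [e for e in exemplars if e[0][j] < cut]
            right = [e for e in exemplars if e[0][j] >= cut]
            cost = weighted_variance(left) + weighted_variance(right)
            if best is None or cost < best[2]:
                best = (j, cut, cost)
    return best
```

Calling best_split on the exemplars of an oversized region returns the feature index and cut value that would define the two child regions. The exhaustive search is quadratic in the region size, which is acceptable only because regions are split before they grow very large.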

The main processing loop of IAC works as follows:


A simple IAC experiment

In their paper "Intrinsic motivation systems for autonomous mental development," Oudeyer, Kaplan, and Hafner describe an experiment designed to test how well IAC achieves its goal of maximizing learning progress. One part of the domain is easy to learn, one part is more complex, and another part is unlearnable. We should expect to see IAC first focus on the easy portion, then focus on the more complex portion, while ignoring the unlearnable portion.

The robot is placed in an environment with a smart toy. The robot can command its left and right wheels and also emit a frequency that makes the toy move. When the frequency is:

The robot senses its distance from the toy. Therefore, SM(t) consists of four items (leftWheelSpeed, rightWheelSpeed, frequency, and currentDistance), and S(t+1) consists of one item (nextDistance).
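One way to picture an exemplar for this experiment is as a pair of small records. The sketch below is an assumption made for illustration, not the representation used in iac.py; the type names SM and Exemplar, the helper make_exemplar, and the numeric values in the usage line are all hypothetical.

```python
from collections import namedtuple

# SM(t): the sensorimotor context at time t (field names follow the text above).
SM = namedtuple("SM", ["leftWheelSpeed", "rightWheelSpeed",
                       "frequency", "currentDistance"])

# An exemplar pairs SM(t) with the sensation observed one step later, S(t+1).
Exemplar = namedtuple("Exemplar", ["sm", "nextDistance"])


def make_exemplar(left, right, freq, dist_now, dist_next):
    """Bundle one time step of experience into an exemplar."""
    return Exemplar(SM(left, right, freq, dist_now), dist_next)


# Arbitrary values, only to show the shape of the data.
example = make_exemplar(0.5, -0.5, 0.8, 0.3, 0.1)
```

Because namedtuples are ordinary tuples, example.sm can be indexed by feature number when searching for a split and read by name everywhere else.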

The paper describes an experiment that lasts for 5000 time steps. In the first 250 time steps, the robot emits all three frequencies equally. This is when the initial region first splits. Next, the robot emits both the second and third frequencies, until another split occurs at time step 500. From then until about time step 3000, the robot emits the third frequency (the easiest one to predict) about 90% of the time. From that point on, it focuses on emitting the second frequency (the more complex one to predict) about 85% of the time. It never spends more than 10% of the time on the first frequency, which is unlearnable.


Reproducing the experiment

In reproducing the above experiment, a number of decisions must be made:

To start the experiment, run: python iac.py

This will generate a number of data files recording the percentage of time the robot emits each of the three frequencies during every 50 time steps. These data files are updated continuously and can be viewed while the experiment is running by running: xgraph -ly 0,1 -P *.data
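The per-window statistic described above can be sketched as follows. This is an illustration under stated assumptions, not the logging code in iac.py: the function name window_fractions is hypothetical, each time step is assumed to be labeled 0, 1, or 2 for the frequency emitted, and values are reported as fractions between 0 and 1 to match the y-axis range given to xgraph. The format of the .data files themselves is not assumed here.

```python
def window_fractions(choices, n_bands=3, window=50):
    """Fraction of time each frequency was emitted in each full window.

    choices: list with one label per time step (0, 1, or 2 by assumption).
    Returns one row per complete window; a trailing partial window is dropped.
    """
    rows = []
    for start in range(0, len(choices) - window + 1, window):
        chunk = choices[start:start + window]
        rows.append([chunk.count(b) / float(window) for b in range(n_bands)])
    return rows
```

Each row sums to 1, so a row such as [0.05, 0.1, 0.85] would mean the robot spent most of that window on the third frequency.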