CS91 Building Robots
Lab Assignment Weeks 9-12

For this lab you will program your robot to periodically seek and avoid light. Then you will use reinforcement learning to teach your robot the same task. Finally, you will compare the results of the programmed solution with those of the learned solution.

The Task

There is a reactive component and a goal-based component to this task. For the reactive portion, your robot must keep moving at all times and avoid getting stuck against obstacles. For the goal-based portion, your robot must either seek or avoid the light, depending on the current goal. As soon as the sum of its light sensor readings goes above some designated maximum value, it has succeeded at seeking the light. Similarly, as soon as this sum falls below some designated minimum value, it has succeeded at avoiding the light. You will have to experiment with your robot to find reasonable minimum and maximum values so that the task can be accomplished but is not too easy. Use a tail light to indicate which goal is currently being attempted.
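
The details of the programmed solution are up to you, but as a rough sketch, a control loop along the following lines would cover both components. It is written in ordinary C: analog(), digital(), and motor() stand in for whatever sensor and motor calls your board's library actually provides, the port numbers follow the table in the next section, and the thresholds are placeholders that you will need to tune. The sketch also assumes that a higher reading means more light.

/* Sketch of a hand-programmed seek/avoid controller.  The three
 * functions below are assumed to come from the robot's library;
 * rename them to match whatever your board actually uses. */
extern int  analog(int port);            /* read a light sensor          */
extern int  digital(int port);           /* read a touch sensor (0 or 1) */
extern void motor(int port, int power);  /* set motor power, -100..100   */

#define SEEK  0
#define AVOID 1
#define LIGHT_MAX 200   /* "found the light" threshold - tune by experiment   */
#define LIGHT_MIN  40   /* "escaped the light" threshold - tune by experiment */

void control_loop(void)
{
    int goal = SEEK;

    while (1) {
        int left  = analog(0);          /* left light sensor  */
        int right = analog(5);          /* right light sensor */
        int sum   = left + right;

        /* Goal-based portion: when the current goal is achieved,
         * switch to the other one. */
        if (goal == SEEK && sum > LIGHT_MAX)
            goal = AVOID;
        else if (goal == AVOID && sum < LIGHT_MIN)
            goal = SEEK;

        /* Tail light shows the current goal: on = seeking. */
        if (goal == SEEK)
            motor(1, 100);
        else
            motor(1, 0);

        /* Reactive portion: keep moving, and back away from
         * obstacles whenever a touch sensor is pressed. */
        if (digital(0) || digital(5)) {
            motor(4, -100);             /* left motor  */
            motor(2, -100);             /* right motor */
        } else if ((goal == SEEK  && left >  right) ||
                   (goal == AVOID && left <= right)) {
            motor(4, 30);               /* veer toward the left  */
            motor(2, 100);
        } else {
            motor(4, 100);              /* veer toward the right */
            motor(2, 30);
        }
    }
}

Putting the obstacle check ahead of the light steering keeps the reactive requirement satisfied: the robot backs away from anything it bumps before it resumes chasing or fleeing the light.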

Running Robots in Tethered Mode

1. First you must move the sensors and motors to the following ports to match the program's expectations:

	Left light 	analog 0
	Right light	analog 5
	Left touch	digital 0
	Right touch	digital 5
	Left motor	motor 4
	Right motor	motor 2
	Tail light	motor 1

2. Download HEXMON40 to the miniboard and leave the serial port connection in place.

3. Put miniboard into run mode.

4. From the top-level directory, execute the program LEARN.EXE. It will prompt you for the run-time parameter values listed below; an example set of answers follows the list.

filename		The filename for saving information from the learning run.
			Use team_directory_name\learn_n where n is a number.
			The program will create .CAR, .MSG, and .WTS files.

execute on robot?	y for yes

specific seed?		n for no

epochs			The number of training steps.  Remember that one step
			takes about a second, so 300 steps is about 5 minutes.

report rate		Use the same number as you used for the epochs.

do learning?		y for yes

periodic training?	n for no
			Answering no gives discontinuous training, in which
			the robot is limited to 50 steps to achieve each goal.

saved weights?		The first time you use this, say n for no, but in
			subsequent tests you may not want to start from
			scratch.
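
As a concrete example, the answers for a first five-minute training run might look like the following. The filename is only a placeholder; substitute your own team directory.

	filename		myteam\learn_1
	execute on robot?	y
	specific seed?		n
	epochs			300
	report rate		300
	do learning?		y
	periodic training?	n
	saved weights?		n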

Evaluation

After learning has ended, open the file with the .MSG extension and go to the end. Record the summary of the punishment received in your lab notebook. If you continue learning with these weights, the amount of time punished should decrease.
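
If you would rather inspect the end of the .MSG file from the command line than in an editor, a small generic utility such as the one below prints the last few lines of any text file; it assumes nothing about the .MSG format itself.

/* tailn.c - print the last N lines of a text file (default 10).
 * Usage: tailn file [lines]
 * Lines longer than 255 characters are counted as several lines. */
#include <stdio.h>
#include <stdlib.h>

#define MAX_LINE 256

int main(int argc, char *argv[])
{
    FILE *fp;
    char (*ring)[MAX_LINE];
    long count = 0;
    int n = 10, i, start, total;

    if (argc < 2) {
        fprintf(stderr, "usage: tailn file [lines]\n");
        return 1;
    }
    if (argc > 2 && atoi(argv[2]) > 0)
        n = atoi(argv[2]);

    fp = fopen(argv[1], "r");
    if (fp == NULL) {
        perror(argv[1]);
        return 1;
    }

    /* Keep only the most recent n lines in a ring buffer. */
    ring = malloc((size_t)n * sizeof *ring);
    if (ring == NULL) {
        fclose(fp);
        return 1;
    }
    while (fgets(ring[count % n], MAX_LINE, fp) != NULL)
        count++;
    fclose(fp);

    /* Print the retained lines in order, oldest first. */
    total = (count < n) ? (int)count : n;
    start = (int)((count - total) % n);
    for (i = 0; i < total; i++)
        fputs(ring[(start + i) % n], stdout);

    free(ring);
    return 0;
}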

Once the robot has gotten fairly successful at the task, describe its behavior. What kind of strategy is it using to seek the light? To avoid the light? You can use the .CAR file to see a summary of the motor commands being used. How similar are the learned strategies to the programmed strategies? Which are more successful?