CS 10, Spring 1998
Lab 8.2 Artificial Intelligence: Computer Learning
Lab 8.2 Instructions
Copy nimgame5 from the "module8" folder on the "Classes" file server
to your disk.
The goal of this lab is to run experiments that show which conditions
let the computer learn to win at nimgame most quickly. To do this,
you'll need to try different values for the
learning parameters fwingain, flosegive, swingain, and swingive, which
are set in the script of the "Reset Everything" button.
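The handout does not show how nimgame5 applies these parameters, so the
Python sketch below is only one plausible reading. It assumes that each row
of the move matrix holds a probability for every legal move, that a win
moves a fraction of the other moves' probability onto the move that was
played, and that a loss gives a fraction of that move's probability away.
The name reinforce and the update rule itself are hypothetical, not taken
from the stack's script.

```python
def reinforce(row, move, rate, won):
    """Return a new probability row after rewarding or penalizing `move`.

    ASSUMPTION: this rule is illustrative only; nimgame5 may work differently.
    row  : dict mapping sticks-taken -> probability (values sum to 1)
    rate : learning parameter between 0 and 1 (0 means no change)
    won  : True shifts probability toward `move`, False shifts it away
    """
    new_row = dict(row)
    others = [m for m in row if m != move]
    if won:
        # Take a fraction `rate` of every other move's probability
        # and add it to the move that led to the win.
        for m in others:
            shift = rate * row[m]
            new_row[m] -= shift
            new_row[move] += shift
    elif others:
        # Give away a fraction `rate` of the losing move's probability,
        # split evenly among the other legal moves.
        shift = rate * row[move]
        new_row[move] -= shift
        for m in others:
            new_row[m] += shift / len(others)
    return new_row
```

Under this reading, a rate near 0 barely changes the row after each game,
while a rate of 1 commits fully to (or fully abandons) the move just played.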
During lab today, choose one or more of the questions
below to investigate. When you've finished, call one of us over to
discuss your results.
When evaluating how well the computer has learned, consider not only the
number of games it has won, but also how well its move matrix
represents the correct strategy for winning every time.
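One way to put a number on "how well the move matrix represents the correct
strategy" is sketched below in Python. It assumes that a player may take 1,
2, or 3 sticks per turn, that the player who takes the last stick loses, and
that the move matrix stores a probability for each move. The names
best_moves and strategy_score are hypothetical helpers, not part of
nimgame5.

```python
def best_moves(max_pile, max_take=3):
    """Map each pile size to the set of winning moves (empty if none exists).

    ASSUMPTION: whoever takes the last stick loses, and a turn takes
    between 1 and `max_take` sticks.
    """
    # Facing an empty pile means the opponent just took the last stick,
    # so the player to move has already won.
    wins = {0: True}
    best = {}
    for pile in range(1, max_pile + 1):
        legal = range(1, min(max_take, pile) + 1)
        # A move is winning if it leaves the opponent in a losing position.
        good = {take for take in legal if not wins[pile - take]}
        best[pile] = good
        wins[pile] = bool(good)
    return best


def strategy_score(matrix, max_take=3):
    """Average probability the matrix gives to winning moves.

    Only pile sizes that have a winning move are counted; 1.0 means the
    matrix always plays correctly whenever a win is possible.
    """
    best = best_moves(max(matrix), max_take)
    scores = [
        sum(p for take, p in matrix[pile].items() if take in best[pile])
        for pile in matrix
        if best[pile]
    ]
    return sum(scores) / len(scores)
```

A score near 1.0 together with a high win count suggests the computer has
learned the real strategy, rather than just how to exploit one opponent.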
- Does the computer learn faster against an opponent whose moves
are determined completely at random, or against a smart opponent?
- To see how quickly player 1 learns against a random opponent, set player
2's learning parameters (swingain and swingive) to 0.
- To see how player 1 fares against a smart
opponent, set player 2's learning parameters to 0, and
set the values of player 2's move matrix so that they represent
the best move to make in each case.
- Does the computer learn better from both positive and negative
reinforcement (learning both when it wins and when it loses), or from
positive reinforcement only (learning only when it wins)? Setting
the learning parameter flosegive to 0 will cause the computer to learn
only from wins.
- Does the computer learn better when the changes it makes to its
strategy are more gradual (this happens when the learning parameters
are closer to 0) or more drastic (this happens when the learning
parameters are closer to 1)?
- What happens when the computer learns against a pathologically
stupid player 2? Will it learn the correct strategy for winning every
game, or will it learn to beat only a player that makes all the wrong
moves? To test this, set player 2's learning parameters to 0, and set
the entries in its move matrix so that player 2 will always make the worst
possible move (for example, when there are two sticks left, player 2 will
pick up both sticks).
- What happens when two learners play against each other?
How quickly does player 1 learn when it plays against a
learner? To test this, compare what happens when player 2's learning
parameters are both set to 0 (so player 2 isn't learning at all) to
what happens when player 2's learning parameters are set to the same
values as player 1's.
- How much longer does it take the computer to learn when it
plays with 10 matchsticks instead of 6? To test this, change the
value of BegPileSize from 6 to 10 in the script of the "Reset Everything"
button.
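To plan an experiment before running it in the stack, the questions above
can be tried out in a small stand-alone simulation. The Python sketch below
is not the nimgame5 script. It assumes that a player takes 1, 2, or 3 sticks
per turn, that taking the last stick loses, that the learner moves first,
and that learning shifts move probabilities by a fixed fraction after each
game. All function names, and the parameters wingain and losegive (loosely
modeled on fwingain and flosegive), are hypothetical.

```python
import random

MAX_TAKE = 3  # ASSUMPTION: a player may take 1, 2, or 3 sticks per turn


def new_matrix(pile_size):
    """Start with every legal move equally likely for each pile size."""
    matrix = {}
    for pile in range(1, pile_size + 1):
        legal = range(1, min(MAX_TAKE, pile) + 1)
        matrix[pile] = {take: 1.0 / len(legal) for take in legal}
    return matrix


def random_opponent(pile, rng):
    """Pick any legal move with equal probability."""
    return rng.randint(1, min(MAX_TAKE, pile))


def smart_opponent(pile, rng):
    """Leave a pile of 1, 5, 9, ... when possible; otherwise take 1."""
    return (pile - 1) % (MAX_TAKE + 1) or 1


def reinforce(row, move, rate, won):
    """Shift probability toward `move` after a win, away after a loss."""
    others = [m for m in row if m != move]
    if not others:
        return
    if won:
        for m in others:
            shift = rate * row[m]
            row[m] -= shift
            row[move] += shift
    else:
        shift = rate * row[move]
        row[move] -= shift
        for m in others:
            row[m] += shift / len(others)


def play_game(matrix, opponent, pile, wingain, losegive, rng):
    """Play one game with the learner moving first; return True if it wins."""
    history = []
    learner_turn = True
    while pile > 0:
        if learner_turn:
            row = matrix[pile]
            take = rng.choices(list(row), weights=list(row.values()))[0]
            history.append((pile, take))
        else:
            take = opponent(pile, rng)
        pile -= take
        learner_turn = not learner_turn
    # Whoever just moved took the last stick and loses, so the learner
    # wins exactly when the turn has passed back to it.
    learner_won = learner_turn
    rate = wingain if learner_won else losegive
    for seen_pile, take in history:
        reinforce(matrix[seen_pile], take, rate, learner_won)
    return learner_won


def run_experiment(opponent, games=500, pile_size=6,
                   wingain=0.2, losegive=0.2, seed=0):
    """Return (number of learner wins, final move matrix)."""
    rng = random.Random(seed)
    matrix = new_matrix(pile_size)
    wins = sum(
        play_game(matrix, opponent, pile_size, wingain, losegive, rng)
        for _ in range(games)
    )
    return wins, matrix
```

For example, run_experiment(random_opponent) and
run_experiment(smart_opponent) compare the two kinds of opponent, passing
losegive=0 restricts learning to wins, and pile_size=10 repeats the run
with 10 sticks. Results from this sketch may differ from what the stack
does, since its update rule is only a guess.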