Lab 4: General Game Playing
Due Feb. 20 by midnight
General game players are systems able to accept descriptions of arbitrary games at runtime and able to use such descriptions to play those games effectively without human intervention.
In other words, they do not know the rules until the games start.
Unlike specialized game players, general game players cannot rely on algorithms designed in advance for specific games.
Introduction
In this lab you will use Monte Carlo tree search to implement a general game playing agent.
The primary Python file you will be modifying is MonteCarloTreeSearch.py.
You have also been provided the following files:
- ConnectFour.py, the same implementation of connect four you had for lab 3.
- Hex.py, which implements hex, a game where players attempt to form a connected path between opposing sides of the board.
- MysteryGame.py, which implements an unspecified game (see if you can figure out the rules).
- BoardGame.py, containing a base class for connect four, hex, and the mystery game.
- PlayGame.py, an interface for playing any of these games.
- BasicPlayers.py, human and random player classes, similar to lab 3.
- HexPlayerBryce.pyc, the "compiled" version of my hex-specific MCTS implementation for you to play against.
This file should mostly look like gibberish.
Attempting to decipher it would be considered cheating.
All three games are played using PlayGame.py, which should be familiar from the last lab.
You can use the -h option for more information about its interface.
To get started, try playing against your partner in one of the new games.
./PlayGame.py hex human human --show
Implementation Tasks
- Node class:
You will need to implement the Node.UCBWeight() method used in node selection and the Node.updateValue() method used for value backpropagation.
It is up to you how to define the value of a node, but it should roughly correspond to a winning percentage.
The only strict requirements are
(1) that wins are better than draws and draws are better than losses, and
(2) that value is always positive (this is necessary for UCBWeight()).
You may also wish to store some extra information in the Node class to facilitate these computations.
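One bookkeeping scheme that satisfies both requirements is a running, smoothed win rate. The following is only a sketch; the helper name and the reward encoding (1.0 for a win, 0.5 for a draw, 0.0 for a loss) are illustrative assumptions, not part of the provided interface:

```python
def update_value(wins, visits, reward):
    """Return updated (wins, visits, value) after one rollout result.

    reward is assumed to be 1.0 for a win, 0.5 for a draw, and 0.0 for
    a loss, from the perspective of the player who moved into the node.
    """
    wins += reward
    visits += 1
    # Laplace smoothing keeps the value strictly positive (required by
    # UCBWeight()) while converging to the raw win rate as visits grow
    value = (wins + 1.0) / (visits + 2.0)
    return wins, visits, value
```

With this scheme wins score higher than draws and draws higher than losses, and the value never reaches zero, so requirement (2) holds even for a node that has only ever lost.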
- MCTS:
Your primary implementation task is to code the Monte Carlo tree search algorithm.
The MCTS() function takes a node from which to start the search, and a number of rollouts to perform.
Each rollout navigates explored nodes via UCB sampling until it reaches the frontier, then expands one new node, then performs a random playout to a terminal state, then propagates the outcome back to expanded nodes along the path.
When all rollouts are complete, the function deterministically returns the best move available from its start node.
Pseudocode for MCTS is provided in the next section.
You will certainly want to break this task down using several helper functions.
Monte Carlo Tree Search
Each time it is your agent's turn to move, your agent's getMove() function should call MCTS(), passing in a node from which to start the search.
Your MCTS() function should implement the following pseudocode.
function MCTS(root)
    for i = 1 : rollouts
        node = root
        initialize path containing root
        # selection
        while all children of node are expanded and node is not terminal
            node = choose child by UCB sampling
            add node to path
        # expansion
        if node is not terminal
            pick a random unexplored move
            create a new_node for the resulting state
            add new_node to node's children and to the path
            # simulation
            outcome = random_playout(new_node's state)
        else
            outcome = node's state
        # backpropagation
        update values and visits along the path according to outcome
    return the best move available from root

helper function UCB_sample(node)
    weights = [UCB_weight for each of node's children]
    distribution = normalize(weights)
    return random sample from distribution

helper function random_playout(state)
    while state is not terminal
        state = random successor of state
    return state

helper function backpropagation(path, outcome)
    for each node in path
        increment node's visits
        update node's value according to outcome's winner
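The pseudocode above can be sketched in Python roughly as follows. The game-state API used here (state.getMoves(), state.makeMove(move), state.isTerminal(), state.winner(), state.player) is an assumption for illustration and should be adapted to whatever BoardGame.py actually provides; likewise, the minimal Node class here is a stand-in for the one in MonteCarloTreeSearch.py, not its required interface.

```python
import math
import random

UCB_CONST = 1.0  # exploration constant; worth tuning per game

class Node:
    """Minimal illustrative node; attribute names are assumptions."""
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = {}   # move -> Node
        self.visits = 0
        self.wins = 0.0
        self.value = 0.5     # neutral, strictly positive prior

    def updateValue(self, winner):
        """winner: player id of the playout winner, or 0 for a draw.

        Assumes state.player is the player to move, so the player who
        moved INTO this node is the opponent, -state.player.
        """
        mover = -self.state.player
        reward = 0.5 if winner == 0 else (1.0 if winner == mover else 0.0)
        self.visits += 1
        self.wins += reward
        # Laplace smoothing keeps value strictly positive for UCBWeight()
        self.value = (self.wins + 1.0) / (self.visits + 2.0)

    def UCBWeight(self):
        explore = math.sqrt(math.log(self.parent.visits) / self.visits)
        return self.value + UCB_CONST * explore

def MCTS(root, rollouts):
    for _ in range(rollouts):
        node, path = root, [root]
        # selection: descend while every child is already expanded
        while (node.children
               and len(node.children) == len(node.state.getMoves())
               and not node.state.isTerminal()):
            node = UCB_sample(node)
            path.append(node)
        # expansion, then simulation from the new child
        if not node.state.isTerminal():
            move = random.choice([m for m in node.state.getMoves()
                                  if m not in node.children])
            child = Node(node.state.makeMove(move), parent=node)
            node.children[move] = child
            path.append(child)
            outcome = random_playout(child.state)
        else:
            outcome = node.state
        backpropagation(path, outcome)
    # deterministic final choice: most-visited child is a common rule
    return max(root.children, key=lambda m: root.children[m].visits)

def UCB_sample(node):
    children = list(node.children.values())
    weights = [c.UCBWeight() for c in children]
    return random.choices(children, weights)[0]

def random_playout(state):
    while not state.isTerminal():
        state = state.makeMove(random.choice(state.getMoves()))
    return state

def backpropagation(path, outcome):
    winner = outcome.winner()
    for node in path:
        node.updateValue(winner)  # updateValue also counts the visit
```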
UCB_weight is calculated according to the formula in the node selection section of mcts.ai.
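That node-selection formula is the standard UCB1 rule: the child's value plus UCB_CONST times the square root of ln(parent visits) divided by child visits. A sketch as a standalone function (the parameter names are illustrative, and the child is assumed to have been visited at least once):

```python
import math

def ucb_weight(child_value, child_visits, parent_visits, ucb_const):
    """Standard UCB1: exploitation term plus a scaled exploration term.

    The exploration term grows as the parent accumulates visits and
    shrinks as this child is visited, so neglected children eventually
    get sampled again.
    """
    explore = math.sqrt(math.log(parent_visits) / child_visits)
    return child_value + ucb_const * explore
```

Note that because the exploration term is nonnegative, the weight stays positive whenever the node's value is positive, which is why requirement (2) on Node values matters.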
Extensions
When you have completed the above implementation tasks, you are encouraged to try some of the following extensions:
- Play your agent against itself.
- Play your agent against your Connect Four TournamentPlayer.
- Play your agent against Bryce's Hex player.
- Try varying the number of rollouts.
- Try varying the UCB_CONST parameter to trade off exploration vs exploitation.
How does the optimal UCB_CONST value depend on the game?
How does it depend on the amount of prior knowledge?
The number of rollouts?
- Try saving the tree that MCTS has built for re-use in the next game.
Useful libraries for accomplishing this include json and pickle. Try playing the version with saving against one without. How much does this improve the agent's play?
- Can you determine what mystery game is implemented by MysteryGame.py?
Submitting your code
Before submitting, ensure that you have added your name(s) to the
top-level comments. To submit your code, you need to use git to add,
commit, and push the files that you modified.
git add MonteCarloTreeSearch.py
git commit -m "final version"
git push