Lab 4: General Game Playing
Due Feb. 20 by midnight

General game players are systems able to accept descriptions of arbitrary games at runtime and able to use such descriptions to play those games effectively without human intervention. In other words, they do not know the rules until the games start. Unlike specialized game players, general game players cannot rely on algorithms designed in advance for specific games.

Introduction

In this lab you will use Monte Carlo tree search to implement a general game playing agent. The primary Python file you will be modifying is MonteCarloTreeSearch.py. You have also been provided the following files:

All three games are played using PlayGame.py, which should be familiar from the last lab. You can use the -h option for more information about its interface.

To get started, try playing against your partner in one of the new games.

./PlayGame.py hex human human --show

Implementation Tasks

  1. Node class: You will need to implement the Node.UCBWeight() method used in node selection and the Node.updateValue() method used for value backpropagation. It is up to you how to define the value of a node, but it should roughly correspond to a winning percentage. The only strict requirements are (1) that wins are better than draws and draws are better than losses, and (2) that value is always positive (this is necessary for UCBWeight()). You may also wish to store some extra information in the Node class to facilitate these computations.
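
     One workable scheme, sketched below, keeps a visit count and a Laplace-smoothed win total so the value stays strictly positive even after nothing but losses. The attribute names, the exploration constant, and the outcome encoding (1 = win, 0.5 = draw, 0 = loss) are assumptions for illustration, not the lab's exact API:

     ```python
     import math

     class Node:
         """Illustrative MCTS node; names here are assumptions, not the lab's API."""
         UCB_CONST = 1.4  # exploration constant; tune experimentally

         def __init__(self):
             self.visits = 0
             self.wins = 0.0  # accumulated outcomes: 1 per win, 0.5 per draw, 0 per loss

         def updateValue(self, outcome):
             """Record one rollout outcome: 1 = win, 0.5 = draw, 0 = loss."""
             self.visits += 1
             self.wins += outcome

         @property
         def value(self):
             # Laplace smoothing keeps the value strictly positive (as UCBWeight
             # requires) while still ordering wins > draws > losses.
             return (self.wins + 1) / (self.visits + 2)

         def UCBWeight(self, parent_visits):
             """UCB1: win-rate estimate plus an exploration bonus."""
             if self.visits == 0:
                 return float("inf")  # always try unvisited children first
             explore = self.UCB_CONST * math.sqrt(math.log(parent_visits) / self.visits)
             return self.value + explore
     ```

     With this scheme a single win yields a value of 2/3, a draw 1/2, and a loss 1/3, so the two strict requirements above hold from the very first update.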

  2. MCTS: Your primary implementation task is to code the Monte Carlo tree search algorithm. The MCTS() function takes a node from which to start the search and a number of rollouts to perform. Each rollout descends through explored nodes via UCB sampling until it reaches the frontier, expands one new node there, performs a random playout to a terminal state, and propagates the outcome back up the path of visited nodes. When all rollouts are complete, the function deterministically returns the best move available from its start node. Pseudocode for MCTS is provided in the next section. You will certainly want to break this task down using several helper functions.

Monte Carlo Tree Search

Each time it is your agent's turn to move, your agent's getMove() function should call MCTS(), passing in a node from which to start the search. Your MCTS() function should implement the following pseudocode.
function MCTS(root, rollouts)
   for i = 1 : rollouts
      node = root
      initialize path containing only root
      # selection
      while all children of node are expanded and node is not terminal
         node = choose child by UCB sampling
         add node to path
      # expansion
      if node not terminal
         pick random unexplored move
         create a new_node for the resulting state
         add new_node to node's children and the path
         # simulation
         outcome = random_playout(new_node's state)
      else
         outcome = node's state
      # backpropagation
      backpropagation(path, outcome)   # update values, visits along the path


helper function UCB_sample(node)
    weights = [UCB_weight for each of node's children]
    distribution = normalize(weights)
    return random sample from distribution
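
This sampling step maps naturally onto Python's random.choices, which normalizes positive weights internally. A minimal sketch, assuming the children and their UCB weights arrive as parallel lists (the provided Node API may organize this differently):

```python
import random

def UCB_sample(children, ucb_weights):
    """Pick one child with probability proportional to its UCB weight.

    `children` and `ucb_weights` are parallel lists; random.choices
    normalizes the weights itself, so they only need to be non-negative
    with a positive total.
    """
    return random.choices(children, weights=ucb_weights, k=1)[0]
```

Because UCBWeight() is required to be positive, no extra normalization or clipping is needed before sampling.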


helper function random_playout(state)
    while state is not terminal
        state = random successor of state
    return state


helper function backpropagation(path, outcome):
    for each node in path
        increment node's visits
        update node's value according to outcome's winner

UCB_weight is calculated according to the formula in the node-selection section of mcts.ai.
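
To see how the pieces fit together, here is a compact, runnable sketch of the full loop on a tiny stand-in game: a two-player Nim variant (take 1 or 2 sticks; whoever takes the last stick wins) invented purely for illustration. The lab's actual games, Node class, and function signatures will differ, so treat this as a shape to follow rather than code to copy:

```python
import math
import random

UCB_CONST = 1.4  # exploration constant (assumption; tune for your games)

class ToyNode:
    """Node for the toy Nim game: `state` sticks remain, `player` moves next."""
    def __init__(self, state, player):
        self.state = state
        self.player = player
        self.children = {}   # move -> ToyNode
        self.visits = 0
        self.wins = 0.0      # wins for the player who moved INTO this node

    def value(self):
        return (self.wins + 1) / (self.visits + 2)  # smoothed, strictly positive

def legal_moves(state):
    return [m for m in (1, 2) if m <= state]

def ucb_weight(child, parent_visits):
    explore = UCB_CONST * math.sqrt(math.log(parent_visits) / child.visits)
    return child.value() + explore

def ucb_sample(node):
    kids = list(node.children.values())
    weights = [ucb_weight(c, node.visits) for c in kids]
    return random.choices(kids, weights=weights, k=1)[0]

def random_playout(state, player):
    """Play uniformly random moves; whoever takes the last stick wins."""
    while True:
        state -= random.choice(legal_moves(state))
        if state == 0:
            return player
        player = 1 - player

def backpropagate(path, winner):
    for node in path:
        node.visits += 1
        if winner != node.player:  # the player who moved into `node` won
            node.wins += 1

def MCTS(root, rollouts):
    for _ in range(rollouts):
        node, path = root, [root]
        # selection: descend while fully expanded and non-terminal
        while node.state > 0 and len(node.children) == len(legal_moves(node.state)):
            node = ucb_sample(node)
            path.append(node)
        # expansion + simulation
        if node.state > 0:
            move = random.choice([m for m in legal_moves(node.state)
                                  if m not in node.children])
            child = ToyNode(node.state - move, 1 - node.player)
            node.children[move] = child
            path.append(child)
            if child.state > 0:
                winner = random_playout(child.state, child.player)
            else:
                winner = node.player        # node's player took the last stick
        else:
            winner = 1 - node.player        # terminal: previous mover won
        # backpropagation
        backpropagate(path, winner)
    # deterministic final choice: the most-visited child's move
    return max(root.children, key=lambda m: root.children[m].visits)
```

In this toy game, positions that are multiples of 3 are losses for the player to move, so from 4 sticks the search should learn to take 1. Note the final move choice is deterministic (most visits); only the in-tree selection is sampled.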

Extensions

When you have completed the above implementation tasks, you are encouraged to try some of the following extensions:

Submitting your code

Before submitting, ensure that you have added your name(s) to the top-level comments. To submit your code, you need to use git to add, commit, and push the files that you modified.
git add MonteCarloTreeSearch.py
git commit -m "final version"
git push