Lab 4: General Game Playing
Due March 6 by midnight

 - - - - - - - -
\ · · · · · · · · \
 \ · · · · · · · · \
  \ · · · · · · · · \
   \ · · · · · · · · \
    \ · · · · · · · · \
     \ · · · · · · · · \
      \ · · · · · · · · \
       \ · · · · · · · · \
          - - - - - - - -


[Two more hex boards, showing positions from a game in progress]

General game players are systems able to accept descriptions of arbitrary games at runtime and able to use such descriptions to play those games effectively without human intervention. In other words, they do not know the rules until the games start. Unlike specialized game players, general game players cannot rely on algorithms designed in advance for specific games.

Starting point code

As in the previous lab, use Teammaker to form your team. You can log in to that site to indicate your partner preference. Once you and your partner have specified each other, a GitHub repository will be created for your team.

Introduction

Before you begin, copy over your MinMaxPlayers.py, Heuristics.py, and (if you implemented it) TournamentPlayers.py from last week. If you did not implement TournamentPlayers.py, a basic version has been provided for you; you'll need to edit it to make it viable for game play. This will allow you to play your MCTS agent (once you have implemented it) against last week's alpha/beta pruning agents on the games Mancala and Breakthrough.
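
Once your MCTS agent exists, a matchup against one of last week's agents might be launched with something like the line below; the player names PlayGame.py actually accepts may differ, so check ./PlayGame.py -h first.

./PlayGame.py mancala mcts minmax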

In this lab you will use Monte Carlo tree search to implement a general game playing agent. The primary Python file you will be modifying is MonteCarloTreeSearch.py. You have also been provided several files that should look familiar from lab 3:

There are also two new Python programs:

And finally, you should have one "compiled" Python program:

Just like last week, all games are played using PlayGame.py. The interface has been updated slightly; you can use the -h option for more information.

To get started, try playing hex against your partner. This game has a large branching factor, so you'll likely have to scroll up to see the game board between turns. The game is played on a grid where (0,0) is the top-left corner and (7,7) is the bottom-right corner.

./PlayGame.py hex human human

Implementation Tasks

  1. Node class: You will need to implement the Node.UCBWeight() method used in node selection. The UCB weight is calculated according to the formula in the node-selection section of mcts.ai. This class will also need to implement the Node.updateValue() method used for value backpropagation. It is up to you how to define the value of a node, but it should roughly correspond to a winning percentage. The only strict requirements are that: You may also wish to store some extra information in the Node class to facilitate these computations, such as the number of wins and a list of untried moves. Make sure you create a COPY of the available moves when initializing the untried moves. (A sketch of one possible design follows this list.)

  2. MCTS: Your primary implementation task is to code the Monte Carlo tree search algorithm. The MCTS() function takes a node from which to start the search, and a number of rollouts to perform. Each rollout: When all rollouts are complete, the function deterministically returns the best move available from its start node. Pseudocode for MCTS is provided in the next section. You will certainly want to break this task down using several helper functions.
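
For reference, the UCB formula in the node-selection section of mcts.ai combines an exploitation term with an exploration bonus:

UCB(child) = value(child) + C * sqrt(ln(parent_visits) / child_visits)

where C is an exploration constant (sqrt(2) in the original UCB1 formulation). The sketch below shows one possible Node design. It is only a sketch: the attribute names, the availableMoves interface, and the +1/-1 player and outcome encodings are illustrative assumptions, not requirements of the lab.

from math import sqrt, log

class Node:
    def __init__(self, state, player):
        self.state = state
        self.player = player      # player to move, assumed encoded +1 / -1
        self.children = {}        # maps a move to the resulting child Node
        self.visits = 0           # times this node appeared on a rollout path
        self.wins = 0.0           # rollouts won, from the parent's perspective
        # COPY the available moves; don't alias the state's own list
        self.untried_moves = list(state.availableMoves)

    @property
    def value(self):
        # roughly a winning percentage; 0.5 before any visits
        return self.wins / self.visits if self.visits else 0.5

    def UCBWeight(self, parent_visits, C=1.0):
        # UCB formula from mcts.ai; by the time this is called during
        # selection, every child has been visited at least once, so
        # self.visits is never zero here
        return self.value + C * sqrt(log(parent_visits) / self.visits)

    def updateValue(self, outcome):
        # outcome: +1 / -1 for a first-/second-player win, 0 for a draw
        self.visits += 1
        if outcome == 0:
            self.wins += 0.5
        elif outcome == -self.player:  # the parent, who chose this node, won
            self.wins += 1.0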

Monte Carlo Tree Search

Each time it is your agent's turn to move, your agent's getMove() function should call MCTS(), passing in a node from which to start the search. Your MCTS() function should implement the following pseudocode.
function MCTS(root_node, rollouts)
  loop for number of rollouts
    node = root_node
    initialize path with node
    # selection
    while all children of node are expanded and node not terminal
       node = choose child with max UCB value
              (based on parent player's perspective)
       add node to path
    # expansion
    if node not terminal
       pick random unexplored move
       apply move to the node's state to get a next_state
       if str(next_state) in tree's nodes
          find next_node in tree's nodes using key str(next_state) 
       else
          create a next_node for the next_state
          add next_node to tree's nodes using key str(next_state)
       add the next_node to the node's children using move as key
       add the next_node to the path
       # simulation
       outcome = random_playout(next_node's state)
    else
       outcome = node's state winner
    # backpropagation
    update values for all nodes in path based on outcome
  return best move from root_node (e.g. the child with the highest value)

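To make the control flow concrete, here is one way the pseudocode could be rendered in Python. This is a sketch, not a reference solution: the game interface (availableMoves, makeMove, isTerminal, winner), the +1/-1/0 winner encoding, and passing the transposition table in as a nodes dictionary all carry over the assumptions from the Node sketch above; adapt them to your own design.

import random

def MCTS(root_node, rollouts, nodes):
    # nodes is assumed to map str(state) -> Node so the tree persists
    # between moves; in your code it might live on your player object
    for _ in range(rollouts):
        node = root_node
        path = [node]
        # selection: descend while fully expanded and not terminal
        while not node.untried_moves and not node.state.isTerminal:
            parent_visits = node.visits
            node = max(node.children.values(),
                       key=lambda child: child.UCBWeight(parent_visits))
            path.append(node)
        # expansion
        if not node.state.isTerminal:
            move = random.choice(node.untried_moves)
            node.untried_moves.remove(move)
            next_state = node.state.makeMove(move)
            key = str(next_state)
            if key in nodes:
                next_node = nodes[key]
            else:
                next_node = Node(next_state, -node.player)
                nodes[key] = next_node
            node.children[move] = next_node
            path.append(next_node)
            # simulation
            outcome = random_playout(next_state)
        else:
            outcome = node.state.winner
        # backpropagation
        for visited in path:
            visited.updateValue(outcome)
    # deterministically return the best move from the root; the child
    # with the highest value (or the most visits) are both common picks
    return max(root_node.children,
               key=lambda move: root_node.children[move].value)

def random_playout(state):
    # play uniformly random moves to the end of the game and return
    # the winner (assumed encoding: +1, -1, or 0 for a draw)
    while not state.isTerminal:
        state = state.makeMove(random.choice(state.availableMoves))
    return state.winner

Your getMove() method would then look up (or create and register) the node for the current state, call MCTS() on it, and return the move MCTS() selects.
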
Extensions

When you have completed the above implementation tasks, you are encouraged to try some of the following extensions:

For any extension that you try, describe what you did and the outcome in a file called extensions.txt.

Submitting your code

Before submitting, ensure that you have added your name(s) to the top-level comments. To submit your code, you need to use git to add, commit, and push the files that you modified.