[Figure: three 8x8 hex game boards, showing an empty board, a game in progress, and a finished game.]
General game players are systems able to accept descriptions of arbitrary games at runtime and able to use such descriptions to play those games effectively without human intervention. In other words, they do not know the rules until the games start. Unlike specialized game players, general game players cannot rely on algorithms designed in advance for specific games.
As in the previous lab, use Teammaker to form your team. You can log in to that site to indicate your partner preference. Once you and your partner have specified each other and the lab has been released, a GitHub repository will be created for your team.
The objective of this lab is to use Monte Carlo Tree Search to implement a general game playing agent.
The primary Python file you will be modifying is MonteCarloTreeSearch.py. You have also been provided several files that should look familiar from lab 3, along with several new Python programs and one "compiled" Python program.
Just like last week, all games are played using PlayGame.py. The interface has been updated slightly; you can use the -h option for more information.
To get started, try playing hex against your partner. This game has a large branching factor, so you'll likely have to scroll up to see the game board between turns. The game is played on a grid where (0,0) is the top-left corner and (7,7) is the bottom-right corner.
./PlayGame.py hex human human
To begin, open the file MonteCarloTreeSearch.py and complete the implementations of the two classes provided.
Each node tracks the wins, losses, and visits accumulated by rollouts through it, and combines them into a single value:

value = 1 + (wins - losses)/visits

The benefits of this formula are that:
- it always falls in the range [0, 2], since (wins - losses)/visits lies between -1 and +1
- it is symmetric for the two players: a value near 2 is good for player +1, a value near 0 is good for player -1, and the same statistics can be read from the opponent's perspective as 2 - value
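As a concrete illustration, a node might store its rollout statistics and compute this value as sketched below. The attribute names and the game_state.turn accessor are assumptions for this sketch, not the starter code's actual API:

import math
import random

class Node:
    # Sketch of an MCTS tree node; adapt names to the starter code.
    def __init__(self, game_state):
        self.state = game_state
        self.player = game_state.turn  # assumed API: +1 or -1 to move
        self.children = {}             # maps each tried move to a Node
        self.wins = 0    # rollouts through this node won by player +1
        self.losses = 0  # rollouts through this node won by player -1
        self.visits = 0  # total rollouts through this node

    @property
    def value(self):
        # Lies in [0, 2]: near 2 when player +1 usually wins from
        # here, near 0 when player -1 usually wins.
        if self.visits == 0:
            return 0.0
        return 1 + (self.wins - self.losses) / self.visits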
Here's pseudocode that fleshes out these steps:
getMove(game_state)
    # Find or create node for game_state
    key = str(game_state)
    if key in tree
        curr_node = get node from tree using key
    else
        curr_node = create new node with game_state
        add curr_node to tree using key
    # Perform Monte Carlo Tree Search from that node
    MCTS(curr_node)
    # Determine the best move from that node
    bestValue = -float("inf"); bestMove = None
    for move, child_node in curr_node's children
        if curr_node's player is +1
            value = child_node.value
        else
            value = 2 - child_node.value
        if value > bestValue
            update bestValue and bestMove
    return bestMove
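In Python, and assuming the Node sketch above plus a self.tree dictionary on the player class, getMove might look roughly like this:

def getMove(self, game_state):
    # Find or create the node for this game state.
    key = str(game_state)
    if key in self.tree:
        curr_node = self.tree[key]
    else:
        curr_node = Node(game_state)
        self.tree[key] = curr_node
    # Perform Monte Carlo Tree Search from that node.
    self.MCTS(curr_node)
    # Choose the move whose child has the best value from the
    # current player's point of view.
    best_value, best_move = -float("inf"), None
    for move, child_node in curr_node.children.items():
        if curr_node.player == 1:
            value = child_node.value
        else:
            value = 2 - child_node.value
        if value > best_value:
            best_value, best_move = value, move
    return best_move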
For example, to watch your MCTS agent play nim against a random opponent:

./PlayGame.py nim mcts random

Here's an example status that might be printed for the root node after the first turn:

node  wins 988, losses 12, visits 1000, value 1.98
child wins 0, losses 2, visits 2, value 0.00, move 3
child wins 7, losses 4, visits 11, value 1.27, move 1
child wins 981, losses 7, visits 988, value 1.99, move 2

Notice that the best move based on the rollouts is to take 2, which puts our opponent at 5 pieces. We saw in class that, with optimal strategy, playing from 5 pieces is a guaranteed loss. MCTS has also discovered this via the rollouts. (Check against the formula: the best child's value is 1 + (981 - 7)/988 ≈ 1.99.)
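A status helper along these lines could produce that output; the exact format string is a guess based on the sample above:

def status(self, node):
    # Print rollout statistics for a node and each of its children.
    print("node  wins %d, losses %d, visits %d, value %.2f" %
          (node.wins, node.losses, node.visits, node.value))
    for move, child in node.children.items():
        print("child wins %d, losses %d, visits %d, value %.2f, move %s" %
              (child.wins, child.losses, child.visits, child.value, move))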
Each rollout consists of four phases: selection, expansion, simulation, and backpropagation.
Pseudocode for MCTS is provided below:
MCTS(current_node)
    repeat num_rollouts times
        path = selection(current_node)
        selected_node = final node in path
        if selected_node is terminal
            outcome = winner of selected_node's state
        else
            next_node = expansion(selected_node)
            add next_node to end of path
            outcome = simulation(next_node's state)
        backpropagation(path, outcome)
    status(current_node) # use for debugging
You will certainly want to break this task down using several helper methods, at least one for each phase of the algorithm.
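To make that division concrete, here is one possible sketch of the four phases as methods of the player class. It assumes the Node class sketched earlier, a game-state API with isTerminal, getMoves, makeMove, and winner methods, an outcome encoding of +1/-1/0, and a UCB-style exploration bonus in selection; all of these are assumptions to adapt to the actual starter code, not requirements:

def selection(self, node):
    # Walk down the tree, repeatedly following the most promising
    # child, until reaching a terminal node or one with untried moves.
    path = [node]
    while (not node.state.isTerminal()
           and len(node.children) == len(node.state.getMoves())):
        best_child, best_score = None, -float("inf")
        for child in node.children.values():
            # Value from the moving player's view, plus a UCB-style
            # bonus that favors rarely visited children.
            view = child.value if node.player == 1 else 2 - child.value
            bonus = math.sqrt(2 * math.log(node.visits) / child.visits)
            if view + bonus > best_score:
                best_child, best_score = child, view + bonus
        node = best_child
        path.append(node)
    return path

def expansion(self, node):
    # Create a child for one untried move, add it to the tree, and
    # return it.
    untried = [m for m in node.state.getMoves() if m not in node.children]
    move = random.choice(untried)
    child = Node(node.state.makeMove(move))
    node.children[move] = child
    self.tree[str(child.state)] = child
    return child

def simulation(self, game_state):
    # Play uniformly random moves to the end of the game and report
    # the winner.
    state = game_state
    while not state.isTerminal():
        state = state.makeMove(random.choice(state.getMoves()))
    return state.winner()

def backpropagation(self, path, outcome):
    # Update the statistics of every node along the path. Wins and
    # losses are counted from player +1's perspective, matching the
    # value formula above; a draw counts as neither.
    for node in path:
        node.visits += 1
        if outcome == 1:
            node.wins += 1
        elif outcome == -1:
            node.losses += 1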
For any extension that you try, describe what you did and the outcome in the file called extensions.md.