Lab 3: Minimax Search and Alpha-Beta Pruning
Due Feb. 20 by midnight

In this lab you will be writing agents that use depth-bounded Minimax search with Alpha-Beta pruning to play Mancala and Breakthrough. In Mancala, players take turns grabbing all of the stones from one house on their side of the board and sowing them counterclockwise. The objective is to end the game with the most stones in one's scoring house. In Breakthrough, players take turns advancing pawn-like pieces on a rectangular board. The objective is to be the first to get a single piece to the opponent's end of the board. The examples below show a mid-game board state from each game.

 -----------------------------------
 |    | 0 | 7 | 1 | 0 | 1 | 0 |    |
 | 13 |-----------------------| 14 |
>|    | 0 | 1 | 3 | 3 | 5 | 0 |    |<
 -----------------------------------

 [Breakthrough: example mid-game board, with each player's pieces advancing toward the opposite side]

The Wikipedia pages for Mancala (https://en.wikipedia.org/wiki/Mancala) and Breakthrough (https://en.wikipedia.org/wiki/Breakthrough_(board_game)) have complete rulesets for each game. Note that Mancala has many, many variants, so if you have played it before, you might have used different rules.

Starting point code

As in the previous lab, use Teammaker to form your team. You can log in to that site to indicate your partner preference. Once you and your partner have specified each other, a GitHub repository will be created for your team.

Introduction

The objectives of this lab are to:

You will need to modify up to three Python files:

You should also look over the following files:

To see the available command-line options for game play try running:

./PlayGame.py -h
Then try playing Breakthrough against a really terrible opponent by running:
./PlayGame.py breakthrough random human --show
And try playing Mancala against your lab partner by running:
./PlayGame.py mancala human human --show
Try playing several games (and refer to the Wikipedia links above) to make sure you understand the rules of each game.

Game Implementations

Begin by completing the following steps:

  1. Open the files Mancala.py and Breakthrough.py to review the methods that have been provided. Notice that both games represent the board as a 2-dimensional array of integers, but otherwise have different internal semantics. In Breakthrough, blank spaces are 0s, the first player's pieces are +1s, and the second player's pieces are -1s. In Mancala, the non-scoring houses are represented by one array, and the scoring houses are represented by another. Both games provide several methods and attributes that you should make use of in your search (a small illustration of the Breakthrough encoding is sketched just after this list):

  2. Open the files BasicPlayers.py and MinMaxPlayers.py to see how we will be implementing game-playing agents. The HumanPlayer and RandomPlayer classes are provided to make your testing easier. All players must implement a getMove() method that takes a game instance representing the current state and returns one of the legal moves from that state.
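
To make the Breakthrough encoding concrete, here is a minimal sketch of what a hypothetical, smaller-than-real board looks like as a 2-dimensional array of 0s, +1s, and -1s. The actual board size, orientation, and attribute names in Breakthrough.py may differ; this is only an illustration of the encoding described above.

    # Illustration only: a small, hypothetical Breakthrough-style board using the
    # 0 / +1 / -1 encoding described above (the real game uses a larger board).
    small_board = [
        [ 1,  1,  1,  1],   # first player's pieces (+1)
        [ 0,  0,  0,  0],   # empty squares (0)
        [ 0,  0,  0,  0],
        [-1, -1, -1, -1],   # second player's pieces (-1)
    ]

    # With this encoding, counting material is just a matter of summing entries:
    p1_pieces = sum(cell == 1 for row in small_board for cell in row)
    p2_pieces = sum(cell == -1 for row in small_board for cell in row)
    print(p1_pieces, p2_pieces)   # prints: 4 4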

Bounded Minimax

Next focus on the MinMaxPlayers.py file, and complete the following steps:

  1. In the file Heuristics.py, implement the basicEval method for Mancala, which is the easier one to start with. (A minimal sketch of one possible Mancala evaluator is given after this list.)

  2. In the file MinMaxPlayers.py, implement the MinMaxPlayer.getMove() method, which should call a helper method to conduct a bounded Minimax search (pseudocode is given below, and a Python sketch follows this list). Minimax returns the best value found for the current player, but we also need the move associated with that value. To do this, the Minimax search will update a class variable that records the best move found so far; this move is what getMove() should return.
    bounded_min_max(state, depth)
       if depth limit reached or state is terminal
           return staticEval(state)
       # initialize bestValue depending on whose turn it is
       # (turn is +1 for the maximizer and -1 for the minimizer)
       bestValue = turn * -infinity
       for each move from state:
           determine the next_state
           # recursive call
           value = bounded_min_max(next_state, depth+1)
           if player is maximizer
              if value > bestValue
                 bestValue = value
                 if depth is 0, update bestMove to current move
           else # player is minimizer
              if value < bestValue
                 bestValue = value
                 if depth is 0, update bestMove to current move
       return bestValue

  3. Thoroughly test Minimax. During testing, add print statements to single games to verify that your evaluation function is working correctly and that Minimax is picking the move with the best available value. Remember that positive values are good for the maximizer and negative values are good for the minimizer.
  4. Once you are certain that Minimax is working properly, remove the print statements and do more extensive tests using a series of games. Even with a depth limit of 1, your MinMaxPlayer should win the significant majority of games against a random player. In general, an agent with depth D should lose or at best tie against an agent with depth D+1. You can set the depth using the -d1 and -d2 arguments (for player 1 and player 2 respectively). The following command plays a game between two minmax agents where player 1 uses depth 2 and player 2 uses depth 4:
    ./PlayGame.py mancala minmax minmax -d1 2 -d2 4 --show
    You can also have your agent play several games (alternating sides) and report the results:
    ./PlayGame.py mancala minmax random -d1 2 -games 10
    You should also try playing against it yourself!
    ./PlayGame.py mancala minmax human -d1 4 --show

  5. Implement the basicEval method for Breakthrough. Be sure to retry the tests above on this second game.
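
Referring back to item 1, a common starting point for the Mancala basicEval is simply the difference between the two scoring houses. The sketch below is only one possibility, and the attribute name scores is an assumption made for illustration; use whatever attributes Mancala.py actually provides.

    def mancala_basic_eval_sketch(game):
        # Sketch only: evaluates a Mancala state as the difference between the
        # two scoring houses, so positive values favor the maximizer and
        # negative values favor the minimizer.
        # NOTE: game.scores is a hypothetical attribute name; substitute the
        # attribute that Mancala.py actually exposes for the scoring houses.
        return game.scores[0] - game.scores[1]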
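
Referring back to item 2, the pseudocode translates fairly directly into Python. The sketch below shows one possible shape for the search, assuming a few hypothetical pieces of game API (gameOver(), getPossibleMoves(), makeMove(), and a turn attribute equal to +1 or -1); match these to the methods actually provided by Mancala.py and Breakthrough.py, and note that the way a successor state is produced may differ from the deepcopy-then-move approach used here.

    import math
    from copy import deepcopy

    class BoundedMinMaxSketch:
        # Sketch of a depth-bounded Minimax player. The game methods used here
        # (gameOver, getPossibleMoves, makeMove) and the turn attribute are
        # assumptions for illustration; adapt them to the starter code.

        def __init__(self, depth_limit, static_eval):
            self.depth_limit = depth_limit
            self.static_eval = static_eval
            self.best_move = None

        def getMove(self, game):
            self.best_move = None
            self.bounded_min_max(game, 0)
            return self.best_move

        def bounded_min_max(self, state, depth):
            if depth == self.depth_limit or state.gameOver():
                return self.static_eval(state)
            # turn is assumed to be +1 (maximizer) or -1 (minimizer)
            best_value = -math.inf if state.turn == 1 else math.inf
            for move in state.getPossibleMoves():
                next_state = deepcopy(state)
                next_state.makeMove(move)
                value = self.bounded_min_max(next_state, depth + 1)
                if state.turn == 1:       # maximizer
                    if value > best_value:
                        best_value = value
                        if depth == 0:
                            self.best_move = move
                else:                     # minimizer
                    if value < best_value:
                        best_value = value
                        if depth == 0:
                            self.best_move = move
            return best_value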

Better Static Evaluation Function

Come up with an improved static evaluator for the Breakthrough game and implement it in the file Heuristics.py.

Your goal is to create an evaluator that will beat the basicEval function when tested using minimax players with equal depth limits.

Your static evaluators should:

Write a clear and thorough comment for your betterEval method describing how it works. If you add helper functions, be sure to include comments describing these as well. You can tell your agent which static evaluator to use via the -e1 and -e2 command line options. The following runs agents that both search to depth 3, but use different static evaluators:

./PlayGame.py mancala minmax minmax -d1 3 -d2 3 -e1 better -e2 basic
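
To give a sense of the kind of feature-based evaluator this section is asking for (not a prescribed solution), the sketch below weighs material against how far each piece has advanced. It assumes the 0 / +1 / -1 board encoding described earlier, a hypothetical board attribute, and that player +1 advances toward higher row indices; check Breakthrough.py for the actual names and orientation.

    def breakthrough_better_eval_sketch(game):
        # Sketch only: material plus advancement for Breakthrough.
        # Assumes game.board is the 2-D array of 0 / +1 / -1 values described
        # earlier, with player +1 moving toward higher row indices and player
        # -1 moving toward lower ones. These names and the orientation are
        # assumptions; adapt them to Breakthrough.py.
        rows = len(game.board)
        score = 0
        for r, row in enumerate(game.board):
            for cell in row:
                if cell == 1:
                    # base value per piece, plus a bonus for advancement
                    score += 10 + r
                elif cell == -1:
                    score -= 10 + (rows - 1 - r)
        return score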

Alpha-Beta Pruning

Next, you should implement Minimax search with alpha-beta pruning in the PruningPlayer class. Note that alpha-beta pruning should always return the same moves that Minimax would, but it can potentially do so much more efficiently by cutting off search down branches that will not change the outcome of the search. You should make sure that your agents are exploring moves in the same order and breaking ties in the same way so that you can check the correctness of alpha-beta pruning by comparing it to standard Minimax. You can run two games between your pruning and Minimax players as follows:
./PlayGame.py mancala minmax pruning -games 2
Note that these agents should be equally matched; pruning should just make decisions faster.
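
As a rough sketch of how the pruning bookkeeping differs from plain Minimax (reusing the same hypothetical game API assumed in the Minimax sketch above), the helper carries alpha and beta bounds and stops expanding a node's moves as soon as the bounds cross. Tracking the best move at depth 0 works exactly as it did for Minimax.

    import math
    from copy import deepcopy

    def bounded_alpha_beta_sketch(state, depth, depth_limit, static_eval,
                                  alpha=-math.inf, beta=math.inf):
        # Sketch of depth-bounded Minimax with alpha-beta pruning, using the
        # same assumed game API as the Minimax sketch above (gameOver,
        # getPossibleMoves, makeMove, turn); adapt the names as needed.
        if depth == depth_limit or state.gameOver():
            return static_eval(state)
        if state.turn == 1:                      # maximizer
            best_value = -math.inf
            for move in state.getPossibleMoves():
                next_state = deepcopy(state)
                next_state.makeMove(move)
                value = bounded_alpha_beta_sketch(next_state, depth + 1,
                                                  depth_limit, static_eval,
                                                  alpha, beta)
                best_value = max(best_value, value)
                alpha = max(alpha, best_value)
                if alpha >= beta:                # minimizer will avoid this branch
                    break
            return best_value
        else:                                    # minimizer
            best_value = math.inf
            for move in state.getPossibleMoves():
                next_state = deepcopy(state)
                next_state.makeMove(move)
                value = bounded_alpha_beta_sketch(next_state, depth + 1,
                                                  depth_limit, static_eval,
                                                  alpha, beta)
                best_value = min(best_value, value)
                beta = min(beta, best_value)
                if alpha >= beta:                # maximizer will avoid this branch
                    break
            return best_value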

Tournaments

Next week, we will hold two tournaments among the agents you submit (one for each game). I will run your TournamentPlayer agents against those of your classmates.

The BreakthroughTournamentPlayer is already set up to use your PruningPlayer and BreakthroughBetterEval implementations. All you need to do is fill in your team's usernames. For the Breakthrough tournament, all agents are required to search to depth 4. The goal of this competition is to come up with the best possible static evaluation function.

The MancalaTournamentPlayer is not yet implemented, and you may implement it to search to any depth (or even use iterative deepening). In this contest, you are encouraged to explore trade-offs between time spent searching and time spent on static evaluation.

Both of your tournament agents must implement non-randomized algorithms and run on a single CPU. Your agents will be limited to ten seconds per move (if they take longer, a random move will be chosen).
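
If you do try iterative deepening for the Mancala tournament, the usual pattern is sketched below: repeatedly run your depth-bounded search with increasing depth limits and keep the move from the deepest search that finished, stopping before the ten-second limit. The function search_at_depth is a placeholder for your own pruning search; a real agent would also need to abort (or avoid starting) a depth that would blow the budget.

    import time

    def iterative_deepening_move_sketch(game, search_at_depth, time_budget=9.0):
        # Sketch of iterative deepening under a per-move time budget.
        # search_at_depth(game, depth) is a placeholder for your own
        # depth-bounded pruning search, assumed to return the best move found.
        # The budget is set below the ten-second tournament limit, but this
        # sketch does not cut off a search that is already in progress.
        start = time.time()
        best_move = None
        depth = 1
        while time.time() - start < time_budget:
            best_move = search_at_depth(game, depth)
            depth += 1
        return best_move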

None of the points for the assignment depend on outcomes in the tournaments; they are strictly for fun. I encourage you to submit agents for both tournaments, even if you don't think your implementation is particularly exciting. The tournaments are more fun when more people play, and you might be surprised how effective simple strategies can be!

Submitting your code

Before submitting, ensure that you have added your name(s) to the top-level comments. To submit your code, you need to use git to add, commit, and push the files that you modified.