Lab 3: Genetic Algorithm
Due Feb. 19 by midnight

Starting point code

This lab may be done alone or with a partner of your choice. Go through the following steps to setup your directory for this lab. This is the same sequence of steps you did last time, except using a different lab number.

  1. First you need to run setup63 to create a git repository for the lab. If you want to work alone do:
    setup63 labs/03 none
    If you want to work with a partner, then one of you needs to run the following while the other one waits until it finishes.
    setup63 labs/03 partnerUsername
    Once the script finishes, the other partner should run it on their account.

  2. For the next step only one partner should copy over the starting point code.
    cd ~/cs63/labs/03
    cp -r ~meeden/public/cs63/labs/03/* ./
    
    This will copy over the starting point files for this lab.

  3. Whether you are working alone or with a partner, you should now add all of the files to your git repo, commit, and push them as shown below.
    git add *
    git commit -m "lab3 start"
    git push
    

  4. If you are working with a partner, your partner can now pull the changes in.
    cd ~/cs63/labs/03
    git pull
    
NOTE: The ga.py file is the only file you will modify for this lab.

Introduction

A genetic algorithm is an abstract simulation of some of the key ideas from biological evolution. Initially a population of candidate solutions is randomly generated. Then evolution proceeds as a sequence of generations. In each generation, the fitness of every candidate solution in the population is determined. Next, parents are selected from the population based on their fitness. With some probability, a crossover operation is done on these parents. Crossover is a simple model of sexual reproduction that recombines the candidate solutions of two parents to produce two children. Finally, with some probability, mutation may occur on each child's candidate solution. In this way, a new population is produced on each generation. Over many generations, the average fitness of the population tends to increase and more highly fit candidate solutions are created.

You have been given a skeleton definition of a GeneticAlgorithm class that you will complete. Then you will test the completed genetic algorithm on a simple problem of finding candidate solutions that maximize the sum of their values.

Once you are convinced that the genetic algorithm is working properly, you will then apply it to the more interesting problems N-Queens and finding Pacman paths in an open maze.

Implement the GeneticAlgorithm class

Open the file ga.py. Read through the methods that you will need to implement. Build your implementation incrementally, testing each new set of methods as you go.

  1. Start by implementing the constructor, and the methods initializePopulation, and evaluatePopulation.

    To test them, write a main program that creates an instance of the SumGA class where the length of the candidate solutions is 10 and the population size is 20. Then call the methods that you implemented. Next, add a for loop to print out each member of the population, and its fitness. Finally, print out the class variables you created to track the best ever fitness and best ever candidate solution. Ensure that they have been set correctly based on your initial population.

  2. Next implement the three main operators used in a genetic algorithm: selection, crossover, and mutation. Below is a review of how each of these operators function.

  3. Implement the method oneGeneration that replaces the current population with a new population generated by applying the three operators: selection, crossover, and mutation. Pseudocode is shown below:
    initialize the newPopulation to an empty list
    loop until the size of newPopulation >= the desired popSize
       select parent1
       select parent2
       child1, child2 = crossover(parent1, parent2)
       mutate child1
       mutate child2
       add child1 and child2 to the newPopulation
    if size of newPopulation is > desired popSize
       randomly choose one child to remove
    population = newPopulation
    increment generation counter
    

    Test this method and see if the average fitness of the next population is higher than the average fitness of the previous population.

  4. Implement the method evolve that repeatedly calls oneGeneration until the maximum number of generations has been reached or the isDone method returns True. Pseudocode is shown below:
    set class variables given in parameters
    initialize a random population
    evaluate the population
    while generation < maxGeneration and optimal solution not found
       run one generation
       evaluate the population
    return the best ever individual and best ever fitness
    

  5. Finally implement the method plotStats to create a graph showing the average and best fitness values for each generation of one run of evolution.

    The above graph was made using the pylab library. You may use this or any other graphing library you prefer. The example graph was made using the pylab functions listed below:

    To read the documentation on these functions do the following:

    python
    >>> import pylab
    >>> help(pylab.plot)
    ...
    
Remember to use git to add, commit, and push your updates before going on to the next exercise.

Evolve Sum of Bits

Now it is time to test whether you can evolve a solution of all 1's using the SumGA class. In your main program, create an instance of this class with candidate solutions of length 15 and a population of size 50. Perform evolution with a maximum generation of 50, a 60% chance of crossover, and a 1% chance of mutation.

If you cannot consistently evolve a solution of all 1's, then there may be a problem with your genetic operators or with your methods that control the evolutionary process. On average, my implementation finds a solution to this problem in about 22 generations.

Evolve N-Queens

Open the file nqueens.py and read through the implementation of the NQueens class. The constructor allows you to create an N-Queens board in two ways. For our GA, we will be passing in a list of column positions for each queen. The length of the given list will determine the size of the board. If you execute this file, you will see that several boards of different sizes are created, printed, and the number of safe queens is also printed.

The NQueensGA class has been defined for you. Each candidate solution will be a list of integers in the range 0 to the length of the list minus one. The fitness method takes a given candidate solution and creates an instance of the NQueens class, and then returns the number of safe queens on the board as the fitness. The isDone method returns True when the fitness of the best individual matches the length of the candidate solutions.

In your main program, create an instance of the NQueensGA class with candidate solutions of length 8 and a population of size 75. Perform evolution with a maximum generation of 200, a 60% chance of crossover, and a 10% chance of mutation. Notice that the mutation rate has been increased. For this problem a higher mutation rate is necessary for success. There may be runs where your GA is unable to find the optimal solution, however, it should be successful most of the time.

Evolve Pacman paths

For this problem, you will be evolving a fixed-length list of actions (where each actions is 'North', 'South', 'East', or 'West'). Pacman will be using the maze called smallOpen that has dimensions 10x5. This maze is completely open, and is full of food. Pacman will be given at most 50 actions to try to maximize his score. If any of his actions lead him into a wall, then his trial will end immediately.

The PacmanGA has been defined for you. The fitness method takes a candidate solution (i.e. a list of 50 actions), and sends them to a ListAgent that executes them in the smallOpen maze with graphics turned off. By turning graphics off we can significantly speed up the running time of the GA. Pacman will invariably run into a wall at some point, which raises an exception for an illegal action. His score will be returned and used as the fitness for that candidate solution.

In your main program, use the following code to set up the evolution for Pacman.

global args
args = pacman.readCommand( sys.argv[1:] )
args['catchExceptions'] = True
# turn graphics off during evolution
origDisplay = args['display']
args['display'] = textDisplay.NullGraphics()

gameGA = PacmanGA(50, 50, ['North', 'South', 'East', 'West'])
bestChromo, bestScore = gameGA.evolve(50, 0.6, 0.01)

# turn graphics on to test the best path found
args['display'] = origDisplay
args['pacman'] = ListAgent(actions = bestChromo)
games = pacman.runGames( **args )
print "\nBest Chomosome found:"
print bestChromo
print "Best Score:", bestScore

My implementation typically finds a best path with a score close to 300. Try experimenting with the parameters of the GA to see if you can consistently get better scores. Adjustments to try:

Putting it all together

Finally, create a main program that provides a menu (as shown below), allowing you to run the problem of your choice. After each run, your program should plot the average and best fitness scores for that run.

Genetic Algorithm

Select problem type
1: Sum of bits
2: N-Queens
3: Pacman
Enter choice: 1

Submitting your code

To submit your code, you need to use git to add, commit, and push the files you modified. You should only have changed ga.py.
cd ~/cs63/labs/03
git add ga.py
git commit -m "the latest version"
git push