Computational Models of Language
Spring 2001
Homework 2: Experimenting with Backpropagation and Multi-Layer Networks
Due: Wednesday, February 14 by midnight

GETTING STARTED

Try the exercise involving XOR described in chapter 5 of the modeling text.

DESCRIPTION

For this homework we will experiment with the learning rate, momentum, and hidden layer size to get a feel for how these parameters affect backpropagation results.
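
For reference, one common textbook form of the backpropagation weight update with momentum is sketched below. The symbols are notation chosen here, not taken from the modeling text: $\eta$ is the learning rate, $\alpha$ is the momentum, $E$ is the error, and $w_{ij}$ is a single connection weight. Tlearn's exact rule may differ in detail.

```latex
\Delta w_{ij}(t) = -\eta \, \frac{\partial E}{\partial w_{ij}} + \alpha \, \Delta w_{ij}(t-1)
```

A larger $\eta$ takes a bigger step along the error gradient on each update, while $\alpha$ carries over a fraction of the previous step.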

INSTRUCTIONS

  1. Open the XOR project from the Chapter 5 folder in the Tlearn software. Under training options, set the following parameters: number of sweeps to 50000, seed randomly, train randomly, and halt if RMS error falls below 0.1. We will vary the learning rate and momentum to find the best combination. For each combination, run three separate training trials and record the number of sweeps needed to learn the task, then calculate the average and compare across combinations. For some combinations, backpropagation may fail to converge within the sweep limit; for these cases, simply record the maximum number of sweeps (50000). Feel free to split up the experiments with other students and share your results.

    Which combinations seem to be the best? Why do you think this is the case?

    Learning Rate   Momentum   Trial 1 Sweeps   Trial 2 Sweeps   Trial 3 Sweeps   Avg Sweeps
    0.1             0
    0.1             0.1
    0.1             0.25
    0.1             0.5
    0.1             0.75
    0.1             1.0
    0.25            0
    0.25            0.1
    0.25            0.25
    0.25            0.5
    0.25            0.75
    0.25            1.0
    0.5             0
    0.5             0.1
    0.5             0.25
    0.5             0.5
    0.5             0.75
    0.5             1.0
    0.75            0
    0.75            0.1
    0.75            0.25
    0.75            0.5
    0.75            0.75
    0.75            1.0
    1.0             0
    1.0             0.1
    1.0             0.25
    1.0             0.5
    1.0             0.75
    1.0             1.0
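
For anyone who wants to sanity-check the procedure outside Tlearn, here is a minimal Python sketch of the same kind of experiment. It is not Tlearn and will not reproduce its numbers. The function names (`sweeps_to_learn`, `average_sweeps`), the initial weight range of -1 to 1, and the reading of a "sweep" as one pattern presentation are all assumptions made for this illustration.

```python
import math
import random

# The four XOR training patterns: (inputs, target).
XOR_PATTERNS = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
                ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def sigmoid(x):
    # Clamp the input so math.exp cannot overflow if weights grow large.
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def forward(w_hid, w_out, inputs):
    """Return (hidden activations, output) for one input pattern."""
    x = list(inputs) + [1.0]  # last entry is the bias input
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hid]
    h = hidden + [1.0]        # last entry is the bias input
    output = sigmoid(sum(w * hi for w, hi in zip(w_out, h)))
    return hidden, output


def rms_error(w_hid, w_out):
    """Root-mean-square error over all four XOR patterns."""
    sq = [(t - forward(w_hid, w_out, x)[1]) ** 2 for x, t in XOR_PATTERNS]
    return math.sqrt(sum(sq) / len(sq))


def sweeps_to_learn(rate, momentum, n_hidden=2, max_sweeps=50000,
                    criterion=0.1, seed=None):
    """Train on randomly chosen patterns; return the sweep count at which
    RMS error first drops below `criterion`, or `max_sweeps` if it never does.
    Assumption: one sweep = one pattern presentation."""
    rng = random.Random(seed)
    w_hid = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    d_hid = [[0.0] * 3 for _ in range(n_hidden)]  # previous weight changes
    d_out = [0.0] * (n_hidden + 1)
    for sweep in range(1, max_sweeps + 1):
        inputs, target = rng.choice(XOR_PATTERNS)
        hidden, out = forward(w_hid, w_out, inputs)
        x = list(inputs) + [1.0]
        h = hidden + [1.0]
        # Error signals, computed before any weight is changed.
        delta_out = (target - out) * out * (1.0 - out)
        delta_hid = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                     for j in range(n_hidden)]
        # Weight change = rate * gradient step + momentum * previous change.
        for j in range(n_hidden + 1):
            d_out[j] = rate * delta_out * h[j] + momentum * d_out[j]
            w_out[j] += d_out[j]
        for j in range(n_hidden):
            for i in range(3):
                d_hid[j][i] = rate * delta_hid[j] * x[i] + momentum * d_hid[j][i]
                w_hid[j][i] += d_hid[j][i]
        if rms_error(w_hid, w_out) < criterion:
            return sweep
    return max_sweeps


def average_sweeps(rate, momentum, trials=3, **options):
    """Average sweep count over several independently seeded trials."""
    counts = [sweeps_to_learn(rate, momentum, **options) for _ in range(trials)]
    return sum(counts) / trials
```

For example, `average_sweeps(0.5, 0.75)` would give the average for one row of the table; looping over the learning rates and momentum values listed above would fill in the rest. Non-converging runs return `max_sweeps`, matching the instruction to record the maximum in those cases.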

  2. For this set of experiments use the best combination of learning rate and momentum that you found above. This time we will vary the number of hidden units and keep all of the other parameters fixed. You will need to modify the configuration file to change the number of hidden units.

    Hidden Units Trial 1 Sweeps Trial 2 Sweeps Trial 3 Sweeps Avg Sweeps
    1
    2
    3
    4
    5

    Consider the table of results. Does increasing the number of hidden units seem to help the network solve the task? Now look closely at the connection weights of several of the solutions that use 5 hidden units. Describe the kinds of roles that individual hidden units take on to solve this task. Are the roles seen here significantly different from those we saw earlier in a network with only 2 hidden units?
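
One way to read a hidden unit's role from its weights is to compute its activation on each of the four XOR inputs and see which patterns switch it on. The sketch below does this for weights typed in by hand. The helper names (`hidden_response`, `describe_role`) and the role labels are hypothetical, it does not read Tlearn's weight files, and thresholding at 0.5 discards graded information that may matter in a real solution.

```python
import math

# The four XOR input patterns, in a fixed order.
XOR_INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]

# On/off patterns over XOR_INPUTS (same order) and a label for each.
KNOWN_ROLES = {
    (0, 1, 1, 1): "OR",
    (0, 0, 0, 1): "AND",
    (1, 0, 0, 0): "NOR",
    (1, 1, 1, 0): "NAND",
    (0, 0, 1, 1): "follows input 1",
    (0, 1, 0, 1): "follows input 2",
    (0, 0, 0, 0): "always off",
    (1, 1, 1, 1): "always on",
}


def hidden_response(w1, w2, bias):
    """Sigmoid activation of one hidden unit on each XOR input."""
    acts = []
    for a, b in XOR_INPUTS:
        net = w1 * a + w2 * b + bias
        net = max(-60.0, min(60.0, net))  # avoid overflow in math.exp
        acts.append(1.0 / (1.0 + math.exp(-net)))
    return acts


def describe_role(w1, w2, bias):
    """Threshold the activations at 0.5 and name the resulting pattern."""
    pattern = tuple(1 if a > 0.5 else 0 for a in hidden_response(w1, w2, bias))
    return KNOWN_ROLES.get(pattern, "other " + str(pattern))


# Illustrative weights only; these are made up, not from a trained network.
print(describe_role(6.0, 6.0, -3.0))   # unit that behaves like OR
print(describe_role(6.0, 6.0, -9.0))   # unit that behaves like AND
```

To use it on your own solutions, replace the made-up weights with the two input weights and the bias of each hidden unit, and compare the labels across the 2-unit and 5-unit networks.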

TURNING IN YOUR ANSWERS

Email your answers to both ekako1 and meeden@cs by the due date.